Numerical and Statistical Methods - CRC

The document outlines the syllabus for the Numerical and Statistical Methods course for BSc (Information Technology) at Bharathidasan University, effective from 2021-22. It includes topics such as floating point arithmetic, interpolation, numerical integration, probability distributions, and statistical inference. The course aims to equip students with essential statistical tools and methodologies applicable in various fields including business, science, and engineering.


NUMERICAL AND STATISTICAL METHODS

BSc (Information Technology)


First Year
Allied Paper I

Bharathidasan University
Centre for Distance and Online Education
Chairman:
Dr. M. Selvam
Vice-Chancellor
Bharathidasan University
Tiruchirappalli-620 024
Tamil Nadu
Co-Chairman:
Dr. G. Gopinath
Registrar
Bharathidasan University
Tiruchirappalli-620 024
Tamil Nadu

Course Co-Ordinator:
Dr. A. Edward William Benjamin
Director-Centre for Distance and Online Education
Bharathidasan University
Tiruchirappalli-620 024
Tamil Nadu
The Syllabus is Revised from 2021-22 onwards

Reviewer
Dr. T. Jai Sankar, Assistant Professor & Head, Department of Statistics, Bharathidasan University, Khajamalai Campus, Tiruchirappalli-620 023

Author
Kalika Patrai, Senior Lecturer in Galgotia Institute of Management and Technology, Greater Noida

"The copyright shall be vested with Bharathidasan University"

All rights reserved. No part of this publication which is material protected by this copyright notice
may be reproduced or transmitted or utilized or stored in any form or by any means now known or
hereinafter invented, electronic, digital or mechanical, including photocopying, scanning, recording
or by any information storage or retrieval system, without prior written permission from the Publisher.

Information contained in this book has been published by VIKAS® Publishing House Pvt. Ltd. and has
been obtained by its Authors from sources believed to be reliable and are correct to the best of their
knowledge. However, the Publisher, its Authors shall in no event be liable for any errors, omissions
or damages arising out of use of this information and specifically disclaim any implied warranties or
merchantability or fitness for any particular use.

Vikas® is the registered trademark of Vikas® Publishing House Pvt. Ltd.


VIKAS® PUBLISHING HOUSE PVT LTD
E-28, Sector-8, Noida - 201301 (UP)
Phone: 0120-4078900 • Fax: 0120-4078999
Regd. Office: A-27, 2nd Floor, Mohan Co-operative Industrial Estate, New Delhi 110 044
Website: www.vikaspublishing.com • Email: [email protected]
SYLLABI-BOOK MAPPING TABLE
Numerical and Statistical Methods

Syllabi | Mapping in Book

Numerical methods: Floating point arithmetic; basic concepts of floating point number systems, implications of finite precision, illustrations of errors due to round-off.
(Unit 1: Errors and Floating Point Arithmetic, Pages 3–28)

Interpolation: Finite difference calculus, polynomial interpolation, uniform approximation, discrete least squares, polynomial and Fourier approximation.
(Unit 2: Interpolation, Pages 29–84)

Numerical integration and differentiation: Interpolatory numerical integration; numerical differentiation.
(Unit 3: Numerical Integration and Differential Interpolation, Pages 85–123)

Solution of non-linear equations: Bisection, fixed point iteration, Newton–Raphson methods.
(Unit 4: Solution of Algebraic and Transcendental Equations, Pages 125–163)

Solution of ordinary differential equations: Taylor series method, Runge–Kutta method, Euler method.
(Unit 5: Numerical Solution to Ordinary Differential Equations, Pages 165–194)

Random variables and their distributions: Random variables (discrete and continuous), probability density and distribution functions, special distributions (binomial, Poisson, uniform, exponential), mean and variance, Chebyshev inequality, independent random variables, functions of random variables and their distributions. Limit theorems: Poisson and normal approximations, central limit theorem, law of large numbers.
(Unit 6: Probability Distribution, Pages 195–223; Unit 7: Approximation Theory, Pages 225–244)

Statistical inference: Estimation and sampling, point and interval estimation, hypothesis testing, power of a test, regression.
(Unit 8: Statistical Inferences, Pages 245–279)
CONTENTS
INTRODUCTION 1–2

UNIT 1 ERRORS AND FLOATING POINT ARITHMETIC 3–28


1.0 Introduction
1.1 Unit Objectives
1.2 Approximate Numbers
1.2.1 Precision
1.2.2 Rounding Off
1.2.3 Defining Precision in Hardware
1.2.4 Compiler Evaluation of Mixed Precision
1.3 Errors
1.3.1 Error in the Approximation of a Function
1.3.2 Error in a Series Approximation
1.3.3 Errors in Numerical Computations
1.4 Floating Point Representation of Numbers
1.4.1 Arithmetic Operations with Normalized Floating Point Numbers
1.4.2 Drawbacks of Floating Point Representation
1.5 Summary
1.6 Key Terms
1.7 Answers to ‘Check Your Progress’
1.8 Questions and Exercises
1.9 Further Reading

UNIT 2 INTERPOLATION 29–84


2.0 Introduction
2.1 Unit Objectives
2.2 Polynomial Interpolation
2.2.1 Finite Differences
2.2.2 Differences of a Polynomial
2.2.3 Some Useful Symbols
2.3 Missing Term Technique
2.3.1 Effect of an Error on a Difference Table
2.4 Newton’s Formulae for Interpolation
2.4.1 Newton–Gregory Forward Interpolation Formula
2.4.2 Newton–Gregory Backward Interpolation Formula
2.5 Central Difference Interpolation Formula
2.5.1 Gauss’s Forward Difference Formula
2.5.2 Gauss’s Backward Difference Formula
2.5.3 Stirling’s Formula
2.5.4 Bessel’s Interpolation Formula
2.5.5 Lagrange’s Interpolation Formula
2.6 Newton’s Divided Difference Interpolation Formula
2.7 Discrete Least Squares Approximation
2.7.1 Fourier Series
2.8 Summary
2.9 Key Terms
2.10 Answers to ‘Check Your Progress’
2.11 Questions and Exercises
2.12 Further Reading

UNIT 3 NUMERICAL INTEGRATION AND DIFFERENTIAL INTERPOLATION 85–123
3.0 Introduction
3.1 Unit Objectives
3.2 Numerical Differentiation
3.3 Differentiating a Tabulated Function
3.4 Differentiating a Graphical Function
3.5 Numerical Integration Interpolation Formulae
3.5.1 Newton–Cotes Quadrature Formulae
3.5.2 Trapezoidal Rule (n = 1)
3.5.3 Simpson’s (1/3) Rule (n = 2)
3.5.4 Boole’s Rule (n = 4)
3.5.5 Weddle’s Rule (n = 6)
3.5.6 Gauss Quadrature Formulae
3.5.7 Errors in Quadrature Formulae
3.6 Approximation Theory
3.7 Summary
3.8 Key Terms
3.9 Answers to ‘Check Your Progress’
3.10 Questions and Exercises
3.11 Further Reading

UNIT 4 SOLUTION OF ALGEBRAIC AND TRANSCENDENTAL EQUATIONS 125–163
4.0 Introduction
4.1 Unit Objectives
4.2 Types of Non-linear Equations
4.3 Intermediate Value Theorem
4.4 Methods of Finding Solutions of Algebraic and Transcendental Equations
4.4.1 Direct Methods
4.4.2 Iterative Method
4.4.3 Initial Guess
4.4.4 Rate of Convergence
4.5 Bisection Method
4.5.1 Minimum Number of Iterations Required in Bisection Method to Achieve the Desired Accuracy
4.5.2 Convergence of Bisection Method
4.6 Regula–Falsi Method or Method of False Position
4.6.1 Convergence of Regula–Falsi Method
4.7 Secant Method
4.8 Fixed Point Iteration Method
4.8.1 Condition for Convergence of Iteration Method
4.8.2 Convergence of Iteration Method
4.9 Newton–Raphson Method
4.9.1 Geometrical Interpretation
4.9.2 Convergence of Newton–Raphson Method
4.9.3 Newton–Raphson Method for System of Non-Linear Equations
4.9.4 Generalized Newton’s Method for Multiple Roots
4.10 Summary
4.11 Key Terms
4.12 Answers to ‘Check Your Progress’
4.13 Questions and Exercises
4.14 Further Reading

UNIT 5 NUMERICAL SOLUTION TO ORDINARY DIFFERENTIAL EQUATIONS 165–194
5.0 Introduction
5.1 Unit Objectives
5.2 Picard’s Method of Successive Approximations
5.3 Taylor’s Series Method
5.4 Euler’s Method
5.5 Runge–Kutta Method
5.5.1 Runge–Kutta Second Order Method
5.5.2 Runge–Kutta Fourth Order Method
5.6 Predictor–Corrector Methods
5.6.1 Modified Euler’s Method
5.6.2 Milne’s Predictor–Corrector Method
5.7 Summary
5.8 Key Terms
5.9 Answers to ‘Check Your Progress’
5.10 Questions and Exercises
5.11 Further Reading

UNIT 6 PROBABILITY DISTRIBUTION 195–223


6.0 Introduction
6.1 Unit Objectives
6.2 Classical Approach to Probability
6.2.1 Sample Space
6.3 Random Variables
6.3.1 Types of Random Variables
6.3.2 Joint and Marginal Probability Density Function
6.4 Discrete Theoretical Distributions
6.4.1 Binomial Distribution
6.4.2 Moments
6.4.3 Poisson Distribution with Mean and Variance
6.4.4 Uniform Distribution
6.5 Summary
6.6 Key Terms
6.7 Answers to ‘Check Your Progress’
6.8 Questions and Exercises
6.9 Further Reading
UNIT 7 APPROXIMATION THEORY 225–244
7.0 Introduction
7.1 Unit Objectives
7.2 Taylor’s Series Representation
7.3 Chebyshev Polynomials and Inequality
7.3.1 Chebychev Inequality
7.4 Distribution
7.4.1 Central Limit Theorem
7.5 Laws of Large Numbers
7.5.1 Weak Law of Large Numbers
7.5.2 Strong Law of Large Numbers
7.6 Normal Approximation
7.7 Summary
7.8 Key Terms
7.9 Answers to ‘Check Your Progress’
7.10 Questions and Exercises
7.11 Further Reading

UNIT 8 STATISTICAL INFERENCES 245–279


8.0 Introduction
8.1 Unit Objectives
8.2 Sampling Theory
8.2.1 The Two Concepts: Parameter and Statistic
8.2.2 Objects of Sampling Theory
8.2.3 Sampling Distribution
8.2.4 The Concept of Standard Error (or S.E.)
8.2.5 Procedure of Significance Testing
8.3 Test of Hypothesis
8.4 Test of Significance
8.4.1 Two-Tailed and One-Tailed Test
8.4.2 Testing a Hypothesis
8.5 Regression
8.5.1 Regression: Definition
8.5.2 Regression Coefficients
8.5.3 Non-Linear Regression
8.5.4 Multiple Linear Regression
8.5.5 Goodness of Fit
8.5.6 Estimate
8.6 Summary
8.7 Key Terms
8.8 Answers to ‘Check Your Progress’
8.9 Questions and Exercises
8.10 Further Reading
INTRODUCTION
Statistics as a subject has made it possible to apply mathematical tools in business environments. It is an advanced form of mathematical application that finds answers to problems that cannot be handled by the analytical methods of mathematics. Statistics is nowadays applied widely in scientific and engineering calculations. The advent of computers has equipped statistics with more powerful tools for analyzing business, scientific and engineering problems. Statistics works on real-life data and finds a methodology that can be uniformly applied in a situation. Statistical analysis can now be done to extract much useful information connected to problems in scientific research, engineering calculations and risk analysis in a business environment. The effectiveness of a method can also be tested from the consistency of field data, and decisions can be taken following a scientific approach to the problem. In its developing stage, statistics was limited to the collection of data for planning and organization of projects pertaining to the welfare of society on a large scale and for military projects. By the end of the 19th century, statistics had extended beyond simple data collection and record keeping to the interpretation of data and the drawing of useful conclusions from it. Today statistics is used for making decisions at both the corporate and the individual level. Its area of application has extended to productivity, marketing and employment.
This course has been designed with an objective of familiarizing students
with the concepts related to:
• Precision, approximations and error in computations,
• Various methods used for interpolations,
• Numerical methods of differentiation and integration,
• Probability distribution,
• Statistical inferences from collected data, and
• Regression analysis.
There is a close relation between probability and statistics. The theory of probability carries out studies on statistical phenomena such as correlation and regression, sampling methods, business decisions and statistical inferences, and analyzes them. Probability deals with the theory of chance, whereas statistics is an advanced tool of mathematical science that is related to the collection of data, operations on data to carry out analysis, and the interpretation, explanation and presentation of data according to the nature of the problem to be handled. It can be categorized as inferential statistics or descriptive statistics. Statistical analysis is very important for taking decisions and is widely used by academic institutions, natural and social science departments, and government and business organizations.
Statistics helps a decision-maker, with a set of limited information, to analyze the risk involved and to select a strategy that carries minimum risk. In scientific and engineering problems, the accuracy of results and the approximations and errors introduced carry great importance, and statistics deals primarily with numerical data gathered from surveys or collected using various statistical methods. Its objective is to summarize such data, so that the summary can give us an indication of certain characteristics of a population or phenomenon that we wish to study.
This book, Scientific and Statistical Computing, has been prepared for students in a professional manner, providing helpful and relevant material in a lucid, self-explanatory and simple language to help them understand the concepts easily. Relevant statistical tables have been given in the appendix for students to refer to when solving statistical problems with accuracy. The concepts have been analyzed in a logical format, beginning with an overview that helps readers to easily grasp the concept. This is followed by explanations and solved examples. Questions and examples in the ‘Check Your Progress’ and ‘Questions and Exercises’ sections will further help in understanding and recapitulation of the subject.
UNIT 1 ERRORS AND FLOATING POINT ARITHMETIC
Structure
1.0 Introduction
1.1 Unit Objectives
1.2 Approximate Numbers
1.2.1 Precision
1.2.2 Rounding Off
1.2.3 Defining Precision in Hardware
1.2.4 Compiler Evaluation of Mixed Precision
1.3 Errors
1.3.1 Error in the Approximation of a Function
1.3.2 Error in a Series Approximation
1.3.3 Errors in Numerical Computations
1.4 Floating Point Representation of Numbers
1.4.1 Arithmetic Operations with Normalized Floating Point Numbers
1.4.2 Drawbacks of Floating Point Representation
1.5 Summary
1.6 Key Terms
1.7 Answers to ‘Check Your Progress’
1.8 Questions and Exercises
1.9 Further Reading

1.0 INTRODUCTION

Floating point arithmetic describes a system for representing numbers that would
be too large or too small to be represented as integers.
In this unit, you will learn that there are many situations where analytical
methods are unable to produce the desired results. These limitations of analytical
methods in practical applications have led mathematicians to develop certain
numerical methods.
Hence, the aim of numerical techniques is to provide constructive methods
for obtaining answers to such problems in a numerical form. However, in most
applications that make use of numerical techniques, the input data is not always
exact. It comes from some measurement or the other and hence contains some
error. The input data refers to approximations, precise up to a certain number
of significant digits. These approximations, as a result, introduce error in the final
computed result.
The error in the final result may be due to an error in the initial data or in the
method or both. Therefore, the use of approximate numbers and/or approximate
methods gives us a solution that is not exact but approximate. So, the main objective

is to obtain an approximate solution of the problem which would be closest to the exact solution.
This unit will discuss the concept of approximate numbers, significant digits,
floating point numbers and operations performed on them. Also, you will learn
about different types of errors, their sources and propagation of errors in numerical
calculations.

1.1 UNIT OBJECTIVES

After going through this unit, you will be able to:


• Understand approximate numbers
• Explain precision
• Know rounding-off errors
• Identify errors in numerical computations
• Understand floating point representation of numbers

1.2 APPROXIMATE NUMBERS

Approximate numbers occur as output of some measurement or calculation. The


concepts of precision and accuracy form the basis for the rules which govern
calculation with approximate numbers resulting from measurement.
There are two types of numbers:
• Exact numbers such as 2, 3, 4, 10, 15, ..., –5/2, 6.45, etc.
• Approximate numbers such as π (= 3.141592...), 1/3 (= 0.3333...), √2 (= 1.414213...).
Thus, approximate numbers are those numbers which cannot be expressed by a finite number of digits. The numbers π, 1/3 and √2 may be approximated by 3.14159, 0.33333 and 1.41421, respectively. They represent the given numbers, i.e., π, 1/3 and √2, to a certain degree of accuracy. So, in other words, an approximate number is one which represents the exact number to a certain degree of accuracy. Significant digits give an idea of the accuracy of approximate numbers.
1.2.1 Precision
Precision is indicated by the number of significant digits in a number. These include all digits except:

• Leading and trailing zeros (unless a decimal point is present), where they serve merely as placeholders to indicate the scale of the number.
• Spurious digits introduced, for example, by calculations carried out to greater accuracy than that of the original data, or measurements reported to a greater precision than the equipment supports.
Significant digits are used to express a number. For example, the numbers 3456, 34.56 and 0.3456 have four significant digits each, whereas the numbers 0.00345, 0.003456 and 0.00034567 have three, four and five significant digits, respectively, since the zeros are used only to fix the position of the decimal point. Similarly, the number 0.34560 has five significant digits. So, you can say that ‘0’ is a significant digit except when it is used to fix the decimal point or to fill the places of unknown or discarded digits.
The best way to identify the significant digits in a given number is to write the number in scientific notation with the first digit being non-zero. The numbers 0.003456, 345600 and 345.603 can be expressed in scientific notation as 0.3456 × 10⁻², 0.3456 × 10⁶ and 0.345603 × 10³, respectively. The first part is known as the mantissa and the second part, i.e., 10ⁿ, is known as the exponent. For example, in 0.3456 × 10⁻², the mantissa is 0.3456 and the exponent is 10⁻². The digits in the mantissa part are the only significant digits in the number.
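The placeholder-zero convention above can be turned into a small routine. Below is a sketch in Python (the function name significant_digits is illustrative, not from the text) that counts significant digits using the decimal module, whose coefficient form already drops leading zeros; trailing zeros in an integer written without a decimal point are treated as placeholders, per the convention above.

```python
from decimal import Decimal

def significant_digits(numeral):
    """Count the significant digits of a decimal numeral given as a string."""
    # Decimal's coefficient keeps exactly the digits written after
    # leading placeholder zeros, e.g. "0.00345" -> (3, 4, 5)
    digits = list(Decimal(numeral).as_tuple().digits)
    # With no decimal point present, trailing zeros are placeholders
    if "." not in numeral and "e" not in numeral.lower():
        while len(digits) > 1 and digits[-1] == 0:
            digits.pop()
    return len(digits)

print(significant_digits("34.56"))       # 4
print(significant_digits("0.00034567"))  # 5
print(significant_digits("0.34560"))     # 5
print(significant_digits("345600"))      # 4 (trailing zeros as placeholders)
```

The string input matters: once a numeral is converted to a float, the distinction between 0.3456 and 0.34560 is lost.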
1.2.2 Rounding Off

Consider a number, say 22/7 = 3.142857143.... In practice, you need to limit such numbers to a manageable number of digits, such as 3.1428 or 3.143. This process of dropping the unwanted digits is called rounding off.
There are some guidelines or rules for rounding off a number to n-significant
digits. These guidelines are as follows:
1. Discard the digits to the right of the nth digit.
2. If the discarded number is:
(i) Less than half a unit in the nth place, leave the nth digit as it is (i.e., unchanged). For example, the number 6.4326 when rounded off to three significant digits is 6.43, as the discarded number is less than half a unit in the nth place.
(ii) Greater than half a unit in the nth place, increase the nth digit by unity (i.e., add 1 to the nth digit).
For example,
6.43267 ~ 6.4327
3.2589 ~ 3.26
(iii) Exactly half a unit in the nth place, increase the nth digit by unity if it is odd, otherwise leave it unchanged.
For example,
6.435 ~ 6.44 (the third digit, 3, is odd)
5.8245 ~ 5.824 (the fourth digit, 4, is even)
Thus, the numbers rounded off to n significant digits are said to be correct or exact to n significant digits.
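The odd/even tie-breaking rule in 2(iii) is what Python's decimal module calls ROUND_HALF_EVEN, so the guidelines can be demonstrated directly. A sketch, with round_sig as an illustrative name:

```python
from decimal import Decimal, ROUND_HALF_EVEN

def round_sig(x, n):
    """Round the numeral x (given as a string) to n significant digits,
    with ties going to the even digit, as in rule 2(iii) above."""
    d = Decimal(x)
    # adjusted() is the exponent of the leading digit; the last digit
    # we keep sits n - 1 places below it
    last_kept = d.adjusted() - (n - 1)
    return d.quantize(Decimal(1).scaleb(last_kept), rounding=ROUND_HALF_EVEN)

print(round_sig("6.4326", 3))  # 6.43  (discarded part < half a unit)
print(round_sig("3.2589", 3))  # 3.26  (discarded part > half a unit)
print(round_sig("6.435", 3))   # 6.44  (exactly half; 3 is odd, so increase)
print(round_sig("5.8245", 4))  # 5.824 (exactly half; 4 is even, so keep)
```

Using Decimal rather than float keeps the "exactly half a unit" cases exact; 6.435 has no exact binary representation as a float.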
1.2.3 Defining Precision in Hardware
In the field of computing, floating point denotes a way to represent numbers that would be too small or too large to represent as integers. Such numbers are generally represented using a fixed number of significant digits, scaled by an exponent; the base used for scaling is usually 2, 10 or 16. A typical number is represented in the form:
significant digits × base^exponent
Floating point is so called since the radix point can ‘float’, which means that
it may be placed anywhere in relation to significant digits for the number and such
position is shown separately in some kind of internal representation in the hardware.
Thus, representation of floating-point is a computer implementation of scientific
notation. There are different systems for representing floating-point in computers
and most common representation has been defined by standard IEEE 754.
With the representation of integers too, the maximum and minimum numbers are limited by the capability of the computer hardware to store them, but the use of floating point supports a wider range of numerical values than a fixed-point representation does. If we use a fixed-point representation having six decimal digits, with the decimal point positioned after the fourth digit, we may represent numbers such as 1234.56 and 3765.43. A floating-point representation with six decimal digits may instead take forms such as 1.23456, 12345.6, 0.0000123456 and 123456000000000. Computer hardware needs more storage for representing floating point numbers. The speed of floating-point operations is used to measure the performance of computers, and it is measured in FLOPS.
By making the radix point adjustable, floating-point notation permits calculations over different magnitudes with a fixed number of digits while maintaining good precision. For example, in a decimal floating-point system using three digits, the multiplication
0.11 × 0.11 = 0.0121
can be written as:
(1.1 × 10⁻¹) × (1.1 × 10⁻¹) = 1.21 × 10⁻²
But in a fixed-point system with the decimal point fixed on the left, it becomes 0.110 × 0.110 = 0.012.

Thus, due to the limit put on the number of digits represented, one digit in the result has been lost. This is so because the decimal point and digits do not float relative to each other.
The range of floating-point numbers depends on the number of digits or bits used to represent the significant digits, known as the significand, and on the exponent. In a typical computer system that uses 64 bits for representing a ‘double precision’ number, the binary floating-point number uses 53 bits for the significand (52 stored bits plus one implicit leading bit), one bit for the sign and 11 bits for the exponent. A positive floating-point number then has a range of approximately 10⁻³⁰⁸ to 10³⁰⁸, where 308 is nearly 1023 × log₁₀(2), as the range of the exponent is from –1022 to 1023. This format has a complete range of –10³⁰⁸ to +10³⁰⁸ as per IEEE 754.
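These binary64 limits can be read off a language runtime whose floats are IEEE 754 doubles, as CPython's are on mainstream hardware:

```python
import sys

# CPython floats are IEEE 754 binary64 on mainstream hardware, so the
# limits quoted above are reported directly by the runtime.
info = sys.float_info
print(info.max)       # 1.7976931348623157e+308, the overflow threshold
print(info.min)       # 2.2250738585072014e-308, smallest positive normalized value
print(info.mant_dig)  # 53 significand bits (52 stored + 1 implicit)
print(info.epsilon)   # 2.220446049250313e-16, the gap from 1.0 to the next float
```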
In the field of computer science, a technique known as arbitrary-precision arithmetic is used, in which calculations are carried out with as many digits of precision as the available memory of the computer system allows. This contrasts with the faster fixed-precision arithmetic available in most ALU (Arithmetic and Logic Unit) hardware, which typically offers a range of 6 to 16 decimal digits.
Standardization has been done by IEEE for computer representation of
binary floating-point numbers. Most modern machines follow this standard. IBM
mainframe is an exception to this and has its own format.
The precisions defined by IEEE 754 for floating point representation are:
• Half precision: 16-bit (binary16)
• Single precision: 32-bit (binary32 and decimal32)
• Double precision: 64-bit (binary64 and decimal64)
• Quadruple precision: 128-bit (binary128 and decimal128)
The number of normalized floating point numbers in a system F(B, P, L, U), where B denotes the number base, P the precision, L the smallest exponent and U the largest exponent that can be represented, is given by:
2 * (B – 1) * B^(P – 1) * (U – L + 1)
UFL (Underflow Level) = B^L is the least positive normalized floating-point number; it has a 1 as the leading digit and 0 for the remaining digits of the mantissa, with the least possible value for the exponent.
Also, OFL (Overflow Level) = B^(U + 1) * (1 – B^(–P)) is the maximum floating point number, having B – 1 as the value of every digit of the mantissa and the maximum possible value for the exponent. In addition to these, there are values lying between –UFL and UFL that cannot be represented as normalized numbers.
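The three formulas above can be evaluated for a toy system. A sketch in Python, with illustrative function names, applied to F(2, 3, –1, 1):

```python
def normalized_count(B, P, L, U):
    # total count of normalized numbers in F(B, P, L, U)
    return 2 * (B - 1) * B ** (P - 1) * (U - L + 1)

def underflow_level(B, L):
    # UFL: smallest positive normalized number, 1.00...0 x B^L
    return B ** L

def overflow_level(B, P, U):
    # OFL: every mantissa digit equal to B - 1, largest exponent
    return B ** (U + 1) * (1 - B ** (-P))

# Toy system: base 2, three-digit precision, exponents from -1 to 1
print(normalized_count(2, 3, -1, 1))  # 2 * 1 * 4 * 3 = 24
print(underflow_level(2, -1))         # 2^-1 = 0.5
print(overflow_level(2, 3, 1))        # 2^2 * (1 - 2^-3) = 3.5
```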
A number is represented by specifying some way for the storage of a number encoded as a string of digits. Logically, a floating-point number has:
• A signed digit string of a given length in a given base, also known as the radix. This is called the significand, or mantissa, or coefficient. The length of the significand gives the precision with which numbers are represented.
• A signed integer exponent, called the scale or characteristic, that modifies the number’s magnitude.
The number is represented as: significant digits × base^exponent.
‘Machine precision’ is a quantity characterizing the accuracy of a floating point system. It is also called machine epsilon and is generally denoted Emach. Its value depends on the rounding method: when rounding towards zero, Emach = B^(1 – P), but when rounding to the nearest digit, Emach = (1/2) * B^(1 – P).
This concept is important as it bounds the relative error in representing a non-zero real number x within the normalized range of a floating point system:
|(fl(x) – x)/x| ≤ Emach
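For binary64 (B = 2, P = 53), the spacing between 1.0 and the next representable number is B^(1 – P) = 2⁻⁵². A short loop, assuming IEEE doubles with round-to-nearest, finds it by repeated halving; under round-to-nearest, the bound Emach quoted above is half of this spacing.

```python
import sys

# Halve eps until adding half of it to 1.0 no longer changes the sum;
# at that point eps is the spacing B**(1 - P) = 2**-52 between 1.0
# and the next representable double.
eps = 1.0
while 1.0 + eps / 2 > 1.0:
    eps /= 2

print(eps)                            # 2.220446049250313e-16
print(eps == sys.float_info.epsilon)  # True
```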
Storage Layout
There are three basic parts of an IEEE floating point number: the sign, the exponent, and the mantissa. The mantissa is a fraction with an implicit leading digit. The base of the exponent is 2; being implicit, it does not require storage.
The table below shows the layout for single precision (32-bit) and double precision (64-bit) floating-point values. The number of bits in each field is shown, with square brackets denoting the bit ranges:

                    Sign     Exponent     Fraction     Bias
Single Precision    1 [31]   8 [30–23]    23 [22–00]   127
Double Precision    1 [63]   11 [62–52]   52 [51–00]   1023

The Sign Bit


Only one bit is used to denote the sign: 0 for a positive number and 1 for a negative number.

The Exponent
This field has to denote positive as well as negative exponents. Hence, a bias is added to the actual exponent before it is stored. For IEEE single-precision floats, the bias is 127. An actual exponent of zero is therefore stored as 127 in the exponent field, and a stored value of 200 represents an exponent of 200 – 127 = 73. The exponents –127 (all bits 0) and +128 (all bits 1) are kept reserved and stand for special numbers.
In the case of double precision, the exponent field contains 11 bits with a bias of 1023.

The Mantissa
The mantissa, also known as the significand, holds the precision bits of the number. It consists of an implicit leading bit and the fraction bits.
1.2.4 Compiler Evaluation of Mixed Precision

Hardware puts a limit on accuracy depending on how many bits are used for floating point representation. Different programming languages define different ways to represent floating point numbers in a computer system. A program written in a programming language is compiled to create object code, which is binary code accepted by the machine.
Precision, in science and engineering, is defined as the degree of agreement among a group of individual measurements or results. Precision in computation and the accuracy found in the final result have a complicated relationship. Non-linear relations in most cases lead to a situation in which an improvement in accuracy differs across different parts of an algorithm. The methods of numerical analysis give knowledge of such relations, which can be used by reducing computational precision in areas that are less sensitive and increasing it in areas that are very sensitive, adjusting the algorithm accordingly. Such an approach is known as mixed precision. In this approach, different levels of precision are set for different portions of an algorithm, and the compiler evaluates them all.
Accuracy is the degree to which a measured value agrees with the actual value. If the actual value is known, there is no need to measure it as such, and you may use it for determining the accuracy of the measuring instrument or tool. If the actual value is not known, accuracy can be inferred from the precision of the measurement. For example, if a calculation is made for expenditure in terms of rupees and paise, two decimal digits are enough in the final result, and it is wise to calculate up to three decimal digits and then round to two digits at the end.
The hardware in a computer system sets a limit on how numbers can be represented, but its full potential is utilized by software programs written in some programming language. Support for number representation is provided by the hardware. In computing, ‘precision’ is used in different senses. One sense is the number of bits representing the mantissa of a floating point number: a single-precision floating point number as per IEEE has 24 bits of precision, equivalent to about 7 decimal digits, while a double-precision number has 53 bits of precision, equivalent to about 15 decimal digits.
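The accuracy cost of working in lower precision can be illustrated by accumulating a sum in simulated single precision (rounding every intermediate result to binary32 via a struct round-trip) versus native double precision. A sketch, with illustrative names:

```python
import struct

def to_f32(x):
    # round a Python float (binary64) to the nearest binary32 value
    return struct.unpack(">f", struct.pack(">f", x))[0]

# Accumulate 0.1 ten thousand times: once with every intermediate
# result rounded to single precision, once in native double precision.
s32 = 0.0
s64 = 0.0
tenth32 = to_f32(0.1)
for _ in range(10000):
    s32 = to_f32(s32 + tenth32)
    s64 += 0.1

print(abs(s32 - 1000.0))  # single-precision error: orders of magnitude larger
print(abs(s64 - 1000.0))  # double-precision error: near the round-off floor
```

In a mixed-precision design, the cheaper low-precision arithmetic would be reserved for the parts of the algorithm where such accumulated error is tolerable.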
Precision is also used to tell how many bits among these are correct. In numerical analysis, computation is formulated to improve precision and accuracy. The order in which computations are made has a bearing on accuracy. Accordingly, in a programming language a short integer is used in some places whereas a long integer is used in others, and the same applies to decimal numbers. If one is writing a program to calculate a factorial, one has to see the limit to which a number can be evaluated without overflow. Since the compiler is capable of changing the order of computations, it can affect the result.
Floating point operations are slower than integer operations. Multiplication takes more time than addition, and square root or division can take ten times longer than addition. Thus, to improve speed it is wise to reduce multiplication operations by replacing them with addition wherever possible, and to replace division by multiplication. Current microprocessors, such as Intel (SSE), AMD and PowerPC, have built-in instructions for approximating a single-precision inverse square root. These instructions give a low-precision approximation that is correct to roughly half the number of bits; with the iterations used in the Newton–Raphson method of numerical computation, the result can be improved to full precision.
The Cray-1 implemented a reciprocal-approximate instruction in place of a divide instruction, which required the compiler to generate Newton–Raphson iterations to give a result with full precision. There are round-off errors in any computer floating point arithmetic, and careful steps have to be followed to avoid results that may be misleading.
There are many formats for floating point arithmetic, from base 2 to base 16,
with or without rounding, and with variations in dynamic range. In the eighties, IEEE
made an effort to standardize floating point formats, and the resulting standard is now used by all
mainstream processors. This provided compatibility, so that answers
computed on one type of processor (Intel, IBM or AMD) should give the same
result on other brands of microprocessors such as Motorola, SPARC or MIPS.
In practice, however, changing the processor can change the computation, and results produced
by different processors may be close but not exactly the same. In some compilers, a
command such as ‘!PRECISION’ sets the number of decimal places used
when calculating and reporting results.

CHECK YOUR PROGRESS


1. Define what an approximate number is.
2. What are significant digits?
3. What is the best way to identify the significant digits in a given number?

1.3 ERRORS

The use of approximate numbers introduces errors in numerical calculations. In


general, error is defined as the difference between exact and approximate values.
Thus,
Error E = | Exact value – Approximate value |

or, E = | Et – Ea |
where Et is the true (exact) value and Ea is the approximate value.
The following are the different types of errors that occur during numerical
computations:
• Inherent errors
Errors that exist in a problem before it is solved are called inherent errors. These
errors occur due to incorrect measurements or observations that may be due to
the limitations of the measuring instrument such as mathematical tables, calculators
or the digital computer. These errors can be minimized by taking better data or by
using high precision computing aids.

• Truncation errors
The errors which occur when some digits from the number are discarded are
known as truncation errors. Truncation errors occur in two situations:
(i) When the numbers are represented in a normalized floating point form.
(ii) During the conversion of the number from one system to another.
While representing a number in normalized floating point form, an error
occurs because only a few digits of the number are accommodated by the mantissa
part. For example, in our hypothetical computer the number 0.003456789 takes the
form 0.3456 × 10^–2 in normalized floating point representation, and the remaining digits are lost.
The error also occurs when a number, say 13.1, is represented in binary form
as 1101.0001100110011... . It has a repeating fraction, and due to this repetition
the conversion is terminated after some digits and hence a truncation error is
introduced.

• Round-off errors
These errors occur during the process of rounding off of a number. These are
unavoidable errors due to the limitations of computing. However, these errors can
be reduced by:
(i) Changing the calculation procedure so as to avoid subtraction of nearly
equal numbers.
(ii) Retaining at least one more significant digit at each step than that given in the
data and rounding off at the last step.

• Absolute error
Let Xt be the true value and Xa be the approximate value, then the positive difference
between Xt and Xa, i.e., | Xt – Xa | is known as absolute error, denoted by Ea.
So, Ea = | Xt – Xa |

• Relative error
The relative error, denoted by Er, is defined as:
Er = Absolute error / True value = | Xt – Xa | / Xt
• Percentage error
The percentage error denoted by Ep is defined as:
Ep = Relative error × 100
   = ( | Xt – Xa | / Xt ) × 100
The relative and percentage errors are independent of the units used while
the absolute error is expressed in terms of these units.
Notes: 1. If a number is correct to n decimal places, then the error in it is (1/2) × 10^–n.
2. If a number is correct to n significant digits with the first significant digit
being k, then the relative error Er < 1/(k × 10^(n–1)).
Example 1.1 If the number 852.47 is correct to five significant digits, then what
will be the relative error?
Solution: Here the first significant digit is k = 8 and total number of significant
digits is n = 5.
Then, absolute error = (1/2) × 10^–m
where m stands for the number of decimal places to which the number is correct.
Here, Ea = (1/2) × 10^–2 = 0.005
Thus, Er ≤ Ea/852.47 = 0.005/852.47 = 5/852470 = 1/(2 × 85247)
One can observe that
Er = 1/(2 × 85247) < 1/(2 × 80000) = 1/(2 × 8 × 10^4)
   < 1/(8 × 10^4), i.e., Er < 1/(k × 10^(n–1))

Example 1.2 Round off the following numbers correct to four significant
digits:
0.00032217, 35.46735 and 18.265101 and compute Ea, Er and Ep in each case.
Solution:
(i) The number rounded off to four significant digits = 0.0003222
So, Ea = | 0.00032217 – 0.0003222 | = 0.00000003 = 0.3 × 10^–7
Er = (0.3 × 10^–7)/0.00032217 = 0.000093119 = 0.93119 × 10^–4
and Ep = Er × 100 = 0.0093119 = 0.93119 × 10^–2
(ii) The number rounded off to four significant digits = 35.47
Then, Ea = | 35.46735 – 35.47 | = 0.00265
Er = 0.00265/35.46735 = 0.74717 × 10^–4
and Ep = Er × 100 = 0.74717 × 10^–2
(iii) The number rounded off to four significant digits = 18.27
Then, Ea = | 18.265101 – 18.27 | = 0.004899
Er = 0.004899/18.265101 = 0.00026822 = 0.26822 × 10^–3
and Ep = Er × 100 = 0.26822 × 10^–3 × 100 = 0.26822 × 10^–1
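The three error measures defined above translate directly into code. A small Python helper (an illustration added here, applied to part (i) of the example):

```python
def errors(true_value, approx_value):
    """Return absolute, relative and percentage errors as defined above."""
    ea = abs(true_value - approx_value)
    er = ea / abs(true_value)
    return ea, er, er * 100

# Part (i): 0.00032217 rounded off to four significant digits is 0.0003222.
ea, er, ep = errors(0.00032217, 0.0003222)
print(abs(ea - 0.3e-7) < 1e-12)     # True: Ea = 0.3 x 10^-7
print(abs(er - 0.93119e-4) < 1e-8)  # True: Er = 0.93119 x 10^-4
```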
1.3.1 Error in the Approximation of a Function
Let y = f(x1, x2) be a function of two variables x1 and x2.
Then, if δx1 and δx2 are the errors in x1 and x2, respectively, the error δy
in y is given by:
y + δy = f(x1 + δx1, x2 + δx2)    ...(1.1)
Expanding the right hand side of Equation (1.1) by using Taylor’s series,
you have
y + δy = f(x1, x2) + ( ∂f/∂x1 · δx1 + ∂f/∂x2 · δx2 ) + ...
If the errors δx1 and δx2 are very small, you can neglect the higher order
terms in δx1 and δx2; then,
y + δy = f(x1, x2) + ( ∂f/∂x1 · δx1 + ∂f/∂x2 · δx2 )    ...(1.2)
so that, approximately,
δy = ∂f/∂x1 · δx1 + ∂f/∂x2 · δx2
In general, the error δy in the function y = f(x1, x2, ..., xn) corresponding to
the errors δxi (i = 1, 2, ..., n) in xi is given by
δy ≈ ∂f/∂x1 · δx1 + ∂f/∂x2 · δx2 + ... + ∂f/∂xn · δxn
and the relative error Er in y is given by
Er = δy/y = (∂f/∂x1) · (δx1/y) + (∂f/∂x2) · (δx2/y) + ... + (∂f/∂xn) · (δxn/y)

1.3.2 Error in a Series Approximation


The Taylor’s series for y = f (x) at x = a with a remainder after n terms is
f(x) = f(a + (x – a))
     = f(a) + (x – a) f′(a) + ((x – a)²/2!) f″(a)
       + ... + ((x – a)^(n–1)/(n – 1)!) f^(n–1)(a) + Rn(x)
where, Rn(x) = ((x – a)^n/n!) f^(n)(θ), a < θ < x
If the series is convergent, Rn(x) → 0 as n → ∞ and hence if f(x) is
approximated by the first n terms of this series, then the remainder term Rn(x) of
the series gives the maximum error. Conversely, if the accuracy required of a series
approximation is specified in advance, the remainder term can be used to find the
number of terms, i.e., n, needed to attain it.
Example 1.3 Find the number of terms of the exponential series such that their
sum gives the value of ex correct to six decimal places at x = 1.

Solution: e^x = 1 + x + x²/2! + x³/3! + ... + x^(n–1)/(n – 1)! + Rn(x)
where, Rn(x) = (x^n/n!) e^θ, 0 < θ < x
Maximum absolute error (at θ = x) = (x^n/n!) e^x
and, Maximum relative error = x^n/n!
Hence, (Er)max at x = 1 is 1/n!
For accuracy up to six decimal places at x = 1, you have
1/n! < (1/2) × 10^–6, i.e., n! > 2 × 10^6
This gives n = 10.
Hence, you need ten terms of the series to ensure that its sum is correct to
six decimal places.
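The conclusion n = 10 can be confirmed numerically. An illustrative Python check (added here, not part of the original text):

```python
from math import e, factorial

# Smallest n with 1/n! < (1/2) * 10^-6, as derived in Example 1.3:
n = 1
while 1 / factorial(n) >= 0.5e-6:
    n += 1
print(n)  # 10

# Summing the first 10 terms indeed gives e correct to six decimals:
approx = sum(1 / factorial(k) for k in range(10))  # 1 + 1/1! + ... + 1/9!
print(abs(approx - e) < 0.5e-6)  # True
```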
Example 1.4 The function f(x) = tan–1x can be expanded as,
tan⁻¹x = x – x³/3 + x⁵/5 – ... + (–1)^(n–1) · x^(2n–1)/(2n – 1)
Find n such that the series determines tan⁻¹(1) correct up to eight
significant digits.
Solution: If you retain n terms, then the (n + 1)th term = (–1)^n · x^(2n+1)/(2n + 1)
For x = 1, the (n + 1)th term = (–1)^n/(2n + 1)
For the determination of tan⁻¹(1) correct up to eight significant digits,
| (–1)^n/(2n + 1) | < (1/2) × 10^–8
⇒ 2n + 1 > 2 × 10^8
This gives n = 10^8.
Example 1.5 Given a value of x = 3.51 with an error δx = 0.001, estimate the
resulting error in the function y = x3.
Solution: You have y = x³
δy = 3x² · δx = 3(3.51)² × (0.001) = 0.03696
Since f(3.51) = 43.2436, you can write f(3.51) = 43.2436 ± 0.03696.
In other words, the value of the function y = x³ at x = 3.51 lies between
(43.2436 – 0.03696) = 43.20664 and (43.2436 + 0.03696) = 43.28056
if x has an error of 0.001.
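The linearized estimate δy = 3x² · δx can be compared against the actual change in x³. An illustrative Python check (added here):

```python
def f(x):
    return x ** 3

x, dx = 3.51, 0.001
dy = 3 * x ** 2 * dx       # linearized error estimate from Example 1.5
actual = f(x + dx) - f(x)  # true change in the function

print(round(dy, 5))             # 0.03696
print(abs(actual - dy) < 1e-4)  # True: estimate and true change agree closely
```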

Example 1.6 Let R = 8xy²/z³ and ∆x = ∆y = ∆z = 0.001. Find the maximum
relative error in R at x = y = z = 1.
Solution: Given ∆x = ∆y = ∆z = 0.001, the error δR in R is given as,
δR = ∂R/∂x · δx + ∂R/∂y · δy + ∂R/∂z · δz
   = (8y²/z³) · δx + (16xy/z³) · δy – (24xy²/z⁴) · δz
Since the errors δx, δy and δz (i.e., ∆x, ∆y, ∆z) may be positive or negative,
you take the absolute values of the terms on the right hand side, which gives
(δR)max = (8y²/z³) δx + (16xy/z³) δy + (24xy²/z⁴) δz
        = 8(0.001) + 16(0.001) + 24(0.001)
        = 0.048
Hence, the maximum relative error is given as,
δR/R = 0.048/8 = 0.006
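The bound computed in Example 1.6 can be verified numerically from the partial derivatives. An illustrative Python check (added here):

```python
def R(x, y, z):
    return 8 * x * y ** 2 / z ** 3

x = y = z = 1.0
d = 0.001
# Partial derivatives of R at (1, 1, 1), from the worked example:
dR_dx = 8 * y ** 2 / z ** 3         # = 8
dR_dy = 16 * x * y / z ** 3         # = 16
dR_dz = -24 * x * y ** 2 / z ** 4   # = -24

dR_max = (abs(dR_dx) + abs(dR_dy) + abs(dR_dz)) * d
print(round(dR_max, 3))               # 0.048
print(round(dR_max / R(x, y, z), 3))  # 0.006
```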
1.3.3 Errors in Numerical Computations

Error in addition of numbers


Let, X = x1 + x2 + ... + xn
∴ X + ∆X = (x1 + ∆x1) + (x2 + ∆x2) + ... + (xn + ∆xn)
The absolute error is,
∴ ∆X = ∆x1 + ∆x2 + ... + ∆xn
⇒ ∆X/X = ∆x1/X + ∆x2/X + ... + ∆xn/X, which is the relative error.
The maximum relative error is,
| ∆X/X | ≤ | ∆x1/X | + | ∆x2/X | + ... + | ∆xn/X |.
It is clear that if two numbers are added then the magnitude of absolute
error in the result is the sum of the magnitudes of the absolute errors in the two
numbers.

Notes: Let us follow the given procedure for adding up numbers of different
absolute accuracies:
1. Isolate the number with the greatest absolute error.
2. Round off all other numbers, retaining in them one digit more than in the
isolated number.
3. Add up.
4. Round off the sum by discarding one digit.

Error in subtraction of numbers


Let, X = x1 – x2
∴ X + ∆X = (x1 + ∆x1) – (x2 + ∆x2)
= (x1 – x2) + (∆x1 – ∆x2)
∴ ∆X = ∆x1 – ∆x2 is the absolute error.
and, ∆X/X = ∆x1/X – ∆x2/X is the relative error.
Maximum relative error = | ∆X/X | ≤ | ∆x1/X | + | ∆x2/X |
Maximum absolute error = | ∆X | ≤ | ∆x1 | + | ∆x2 |
Error in product of numbers
Let, X = x1 · x2 · ... · xn
It is known that if X is a function of x1, x2, ..., xn,
Then, ∆X = (∂X/∂x1) ∆x1 + (∂X/∂x2) ∆x2 + ... + (∂X/∂xn) ∆xn
Now, ∆X/X = (1/X)(∂X/∂x1) ∆x1 + (1/X)(∂X/∂x2) ∆x2 + ... + (1/X)(∂X/∂xn) ∆xn
Now, (1/X)(∂X/∂x1) = (x2 · x3 ... xn)/(x1 · x2 · x3 ... xn) = 1/x1
     (1/X)(∂X/∂x2) = (x1 · x3 ... xn)/(x1 · x2 · x3 ... xn) = 1/x2
     ⋮
     (1/X)(∂X/∂xn) = 1/xn
∴ ∆X/X = ∆x1/x1 + ∆x2/x2 + ... + ∆xn/xn
The relative and absolute errors are given by,
Maximum relative error = | ∆X/X | ≤ | ∆x1/x1 | + | ∆x2/x2 | + ... + | ∆xn/xn |
Maximum absolute error = | ∆X | = | ∆X/X | · X = | ∆X/X | · (x1x2x3 ... xn)

Error in division of numbers

Let, X = x1/x2
∴ ∆X/X = (1/X)(∂X/∂x1) ∆x1 + (1/X)(∂X/∂x2) ∆x2
        = (x2/x1) · (1/x2) · ∆x1 + (x2/x1) · (–x1/x2²) · ∆x2
        = ∆x1/x1 – ∆x2/x2
∴ | ∆X/X | ≤ | ∆x1/x1 | + | ∆x2/x2 |, which is the maximum relative error.
Absolute error = | ∆X | ≤ | ∆X/X | · X
X
Error in evaluating x^k
Let X = x^k, where k is an integer or fraction. Then,
∆X = (dX/dx) ∆x = k x^(k–1) · ∆x
and ∆X/X = k · (∆x/x)
Thus the relative error in evaluating x^k is k times the relative error ∆x/x in x.
CHECK YOUR PROGRESS
4. How are errors introduced in numerical calculations?
5. What is the result when two numbers are added?
6. Name the two formats in which real numbers can be written.

1.4 FLOATING POINT REPRESENTATION OF NUMBERS

You know that real numbers can be written in two formats:


1. Fixed point (without exponent)
2. Floating point (with exponent)
Now assume a hypothetical computer whose memory locations can each store six
digits, with provision for one or more signs. In such a computer a real number can be
stored by assuming a fixed position for the decimal point, and hence all numbers will
be stored, after appropriate shifting if necessary, with an assumed decimal point
(see Figure 1.1).
Figure 1.1 A Memory Location Storing the Number 3257.5 (six stored digits with a sign and an assumed decimal point position)
In this convention, the maximum possible number that can be stored is 9999.99
and the minimum possible non-zero number is 0000.01 in magnitude. This range
is quite inadequate in practice.
Thus, a new methodology is adopted that aims to preserve the maximum
number of significant digits in a real number and also increases the range of the real
numbers stored. In standard practice the decimal point can be moved freely to the
left or to the right. More convenient is to place the decimal point immediately
before the most significant digit (i.e., the first non-zero digit of the number). This
shifting of decimal point to the left of the most significant digit is known as
normalization and the real numbers represented in this form are called normalized
floating point numbers. Thus, this representation of numbers is known as normalized
floating point representation.
The normalized floating point numbers consist of two parts:
(i) Mantissa–the fractional part
(ii) Exponent

For positive numbers the mantissa ranges from 0.1 up to 1.0 and for negative
numbers from –1.0 up to –0.1. In other words, the mantissa of a normalized
floating point number must satisfy the condition:
0.1 ≤ | Mantissa | < 1.0
NOTES
For example, the number 0.3245 × 10^5 is represented in this form as 0.3245 E 5,
where E 5 is used to represent 10^5. Thus, here the mantissa is 0.3245 and the
exponent is 5. This number is stored in the hypothetical computer as shown in
Figure 1.2.
Figure 1.2 A Memory Location Storing the Number 0.3245 E 5 (four mantissa digits with a sign and an implied decimal point, followed by the two-digit exponent with its sign)

As another example, the number 0.0032517 may be stored as 0.3251


E–2 because the leading zeros only fix the position of the decimal point.
Notes: 1. In normalized floating point representation the range of numbers that can
be stored is 0.9999 × 10^99 to 0.1000 × 10^–99 in magnitude.
2. This increase in range has been obtained by reducing the number of
significant digits by 2.
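The normalization just described is easy to mimic in software. The Python sketch below (an illustration added here; extra mantissa digits are chopped rather than rounded) produces the (mantissa, exponent) pair used by the hypothetical computer:

```python
def normalize(x):
    """Return (mantissa, exponent) with 0.1 <= |mantissa| < 1 and a
    four-digit mantissa, mimicking the hypothetical computer; extra
    mantissa digits are chopped, not rounded."""
    if x == 0:
        return 0.0, 0
    m, exp = abs(x), 0
    while m >= 1.0:
        m /= 10.0
        exp += 1
    while m < 0.1:
        m *= 10.0
        exp -= 1
    m = int(m * 10000) / 10000      # keep four mantissa digits
    return (-m if x < 0 else m), exp

print(normalize(0.0032517))  # (0.3251, -2), i.e. 0.3251 E -2
print(normalize(3257.5))     # (0.3257, 4),  i.e. 0.3257 E 4
```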
1.4.1 Arithmetic Operations with Normalized Floating Point Numbers
Addition and Subtraction
If the two numbers represented in a normalized floating point notations are to be
added, the exponents of the two numbers must be made equal and the mantissa
should be shifted appropriately. The operation of subtraction is nothing but the
addition of a negative number.
Example 1.7 Add the following floating point numbers:
(i) 0.4546 E 5 and 0.5433 E 5
(ii) 0.4546 E 5 and 0.5567 E 7
(iii) 0.4546 E 3 and 0.5567 E 7
(iv) 0.6434 E 3 and 0.4845 E 3
(v) 0.6434 E 99 and 0.4845 E 99.
Solution: (i) Here the exponents are equal.
∴ Mantissas are added.
∴ Sum = 0.4546 E 5 + 0.5433 E 5
= 0.9979 E 5

(ii) Here exponents are not equal. The operand with the larger exponent
is kept as it is.
0.5567 E 7
+ 0.0045 E 7 | 0.4546 E 5 = 0.0045 E 7
0.5612 E 7
(iii) The addition will be as follows:
0.5567 E 7
+ 0.0000 E 7 | ∵ 0.4546 E 3 = 0.0000 E 7
0.5567 E 7
(iv) 0.6434 E 3
0.4845 E 3
1.1279 E 3 (not normalized)
Here exponents are equal but when the mantissas are added, the sum
is 1.1279 E 3. As the mantissa has five digits and is > 1, it is shifted
right one place before it is stored.
Hence, Sum = 0.1127 E 4
(v) 0.6434 E 99
0.4845 E 99
1.1279 E 99
Here, again the sum of the mantissas exceeds 1. The mantissa is shifted
right and the exponent is increased by 1, resulting in a value of 100 for
the exponent. The exponent part cannot store more than two digits.
This condition is called an overflow condition and the arithmetic unit
will intimate an error condition.
Example 1.8 Subtract the following floating point numbers:
(i) 0.9432 E – 4 from 0.5452 E – 3 (ii) 0.5424 E 3 from 0.5452 E 3
(iii) 0.5424 E – 99 from 0.5452 E – 99.
Solution: (i) 0.5452 E – 3
– 0.0943 E – 3
0.4509 E – 3
(ii) 0.5452 E 3
– 0.5424 E 3
0.0028 E 3
In a normalized floating point, the mantissa is ≥ 0.1
Hence, the result is 0.28 E 1

(iii) 0.5452 E – 99
– 0.5424 E – 99
0.0028 E – 99
For normalization, the mantissa must be shifted left by two places and the exponent
reduced by 2. The exponent would become – 100 with the first left shift itself, which
cannot be accommodated in the exponent part of the number.
This condition is called an underflow condition and the arithmetic unit will
signal an error condition.
Notes: 1. If the result of an arithmetic operation gives a number smaller than
0.1000 E – 99 then it is called an underflow condition. Similarly, any
result greater than 0.9999 E 99 leads to an overflow condition.
2. The mantissa of the sum (before normalization) can be at most +1.9999,
in which case you need to shift the decimal point to the left by one
position in order to normalize it. As a result, the exponent of the sum is
incremented by 1.
Example 1.9 In normalized floating point mode, carry out the following
mathematical operations:
(i) (0.4546 E 3) + (0.5454 E 8) (ii) (0.9432 E – 4) – (0.6353 E – 5).
Solution: (i) 0.5454 E 8
+ 0.0000 E 8 | ∵ 0.4546 E 3 = 0.0000 E 8
0.5454 E 8

(ii) 0.9432 E – 4
– 0.0635 E – 4 | ∵ 0.6353 E – 5 = 0.0635 E – 4
0.8797 E – 4

Multiplication
Two numbers are multiplied in the normalized floating point mode by multiplying
the mantissas and adding the exponents. After the multiplication of the mantissas,
the resulting mantissa is normalized as in an addition or subtraction operation and
the exponent is appropriately adjusted.
Example 1.10 Multiply the following floating point numbers:
(i) 0.5543 E 12 and 0.4111 E – 15
(ii) 0.1111 E 10 and 0.1234 E 15
(iii) 0.1111 E 51 and 0.4444 E 50
(iv) 0.1234 E – 49 and 0.1111 E – 54.
Solution: (i) 0.5543 E 12 × 0.4111 E – 15 = 0.2278 E – 3
(ii) 0.1111 E 10 × 0.1234 E 15 = 0.1370 E 24

(iii) 0.1111 E 51 × 0.4444 E 50 = 0.4937 E 100
The result overflows.
(iv) 0.1234 E – 49 × 0.1111 E – 54 = 0.1370 E – 104
The result underflows.
Example 1.11 Apply the procedure for the following multiplications:
(i) (0.5334 × 109) × (0.1132 × 10–25)
(ii) (0.1111 × 1074) × (0.2000 × 1080)
Indicate if the result is overflow or underflow.
Solution: (i) 0.5334 E 9 × 0.1132 E – 25 = 0.6038 E – 17
(ii) 0.1111 E 74 × 0.2000 E 80 = 0.2222 E 153
Hence, the above result overflows.
Example 1.12 Multiply 0.1234 E – 75 by 0.1111 E – 37.
Solution: 0.1234 E – 75 × 0.1111 E – 37
= (0.1234 × 0.1111) E (–75 – 37)
= 0.01370974 E – 112 (before normalization)
= 0.1370 E – 113 (after normalization)
This is a case of underflow, as the exponent is – 113.
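Multiplication can be simulated in the same spirit as addition. An illustrative Python sketch (added here; non-zero operands are assumed and the mantissa is chopped to four digits):

```python
def fp_mul(m1, e1, m2, e2):
    """Multiply two normalized four-digit floating point numbers
    (mantissa, exponent): multiply mantissas, add exponents,
    renormalize, chop, and flag overflow/underflow."""
    m, e = m1 * m2, e1 + e2
    while abs(m) < 0.1:         # renormalize, e.g. 0.0137... -> 0.137...
        m *= 10
        e -= 1
    m = int(m * 10000) / 10000  # chop the mantissa to four digits
    if e > 99:
        return 'overflow'
    if e < -99:
        return 'underflow'
    return m, e

print(fp_mul(0.5543, 12, 0.4111, -15))   # (0.2278, -3) as in Example 1.10(i)
print(fp_mul(0.1111, 51, 0.4444, 50))    # 'overflow'   as in Example 1.10(iii)
print(fp_mul(0.1234, -75, 0.1111, -37))  # 'underflow'  as in Example 1.12
```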

Division
In division, the mantissa of the numerator is divided by that of the denominator. The
denominator exponent is subtracted from the numerator exponent. The quotient mantissa
is normalized to make the most significant digit non-zero and the exponent is
appropriately adjusted. The mantissa of the result is chopped down to four digits.
Example 1.13 Perform the division operation on following operations:
(i) 0.9998 E 1 ÷ 0.1000 E – 99
(ii) 0.9998 E – 5 ÷ 0.1000 E 98
(iii) 0.1000 E 5 ÷ 0.9999 E 3.
Solution:
(i) 0.9998 E 1 ÷ 0.1000 E – 99 = 0.9998 E 101
Hence the result overflows.
(ii) 0.9998 E – 5 ÷ 0.1000 E 98 = 0.9998 E – 102
Hence the result underflows.
(iii) 0.1000 E 5 ÷ 0.9999 E 3 = 0.1000 E 2.
Example 1.14 Evaluate, applying normalized floating point arithmetic, the
following:
1 – cos x at x = 0.1396 radian
Assume, cos (0.1396) = 0.9903.
Compare it with the value obtained from the identity 1 – cos x = 2 sin² (x/2).
Assume, sin (0.0698) = 0.6974 E – 1.
Solution: 1 – cos (0.1396) = 0.1000 E 1 – 0.9903 E 0
= 0.1000 E 1 – 0.0990 E 1 = 0.1000 E – 1
Now, sin (x/2) = sin (0.0698) = 0.6974 E – 1
2 sin² (x/2) = (0.2000 E 1) × (0.6974 E – 1) × (0.6974 E – 1)
             = 0.9727 E – 2
The value obtained by the alternate formula is closer to the true value
0.9728 E – 2.
Example 1.15 Divide 0.9998 E – 5 by 0.1000 E 99
Solution: 0.9998 E – 5 ÷ 0.1000 E 99
= (0.9998 ÷ 0.1000) E(–5 – 99)
= 9.9980 E – 104 = 0.9998 E – 103
Since the mantissa of the quotient was greater than 1, the decimal point is
shifted one position to the left and the exponent is increased by 1. The resulting
quotient becomes 0.9998 E – 103, which is a case of underflow.
1.4.2 Drawbacks of Floating Point Representation
In a normalized floating point representation, the mantissa has to be truncated to
four digits to fit it in the format of the hypothetical computer. This truncation of
mantissa leads to a number of surprising results.
1. You have,
4x = x + x + x + x
But if this arithmetic is performed by using normalized floating point
representation, then this may not be true.
2. The associative and distributive laws of arithmetic may not hold true.
3. While performing arithmetic in a normalized floating point format, the
equality of an expression to zero can never be assured.

Example 1.16 The quadratic equation x² + 2x – 2 = 0 has roots –1 ± √3.
Express these roots in normalized floating point and verify them.
Solution: The roots are expressed in normalized floating point as 0.7320 E 0 and
–0.2732 E 1

If you substitute x = 0.7320 E 0 in the expression x² + 2x – 2, you obtain:
(0.7320 E 0)² + 2(0.7320 E 0) – 2
= (0.7320 E 0 × 0.7320 E 0)
+ (0.2000 E 1 × 0.7320 E 0) – 0.2000 E 1
= (0.5358 E 0) + (0.1464 E 1) – (0.2000 E 1)
= (0.0535 E 1 + 0.1464 E 1) – (0.2000 E 1)
= (0.1999 E 1 – 0.2000 E 1)
= –0.0001 E 1 = –0.1000 E – 2
Ideally, the expression should yield the value zero. Thus, in an algorithm it is
not advisable to write a branching or looping instruction that compares the value
of an expression with zero.
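The same caution applies to ordinary double-precision arithmetic, as a one-line Python experiment (added here for illustration) shows:

```python
# An expression that is algebraically zero need not evaluate to zero:
x = 0.1 + 0.2 - 0.3
print(x == 0)          # False: x is a tiny non-zero round-off residue
print(abs(x) < 1e-12)  # True: compare against a small tolerance instead
```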

CHECK YOUR PROGRESS


7. What must be done if two numbers represented in normalized floating
point notation are to be added?
8. What happens in the division operation?
9. What is not advisable in algorithms?

1.5 SUMMARY

In this unit, you have learned that:


• There are two types of numbers: exact numbers such as 2, 3, 4,
10, 15, etc. and approximate numbers such as π (≈ 3.141592), etc.
• An approximate number is one which represents the exact number to a
certain degree of accuracy. Significant digits give the idea of accuracy of
approximate numbers.
• The significant digits of a number are those digits that carry meaning,
contributing to its precision. Significant digits are used to express a
number.
• The best way to identify the significant digits in a given number is to write
the number in the scientific notation with the first digit being non-zero.
• The first part of the scientific notation is known as the mantissa and the second
part, i.e., 10^n, is known as the exponent.
• The process of dropping the unwanted digits is called rounding off.
• The use of approximate numbers introduces error in numerical calculations.

• Error is defined as the difference between exact and approximate values.
• Errors that exist in a problem before it is solved are called inherent errors.
These errors occur due to incorrect measurements or observations that
may be due to the limitations of the measuring instrument such as
mathematical tables, calculators or the digital computer. These errors can
be minimized by taking better data or by using high precision computing
aids.
• The errors which occur when some digits from the number are discarded
are known as truncation errors.
• Round-off errors occur during the process of rounding-off a number.
Rounding-off errors are unavoidable errors due to the limitations of
computing.
• If two numbers represented in a normalized floating point notation are to be
added, the exponents of the two numbers must be made equal and the
mantissa should be shifted appropriately.

1.6 KEY TERMS

• Significant digits: These are digits that carry meaning, contributing to their
precision.
• Rounding off: It is the process of dropping the unwanted digits.
• Truncation errors: These are errors that occur when some digits from a
number are discarded.
• Normalization: It is the shifting of a decimal point to the left of the most
significant digit.

1.7 ANSWERS TO ‘CHECK YOUR PROGRESS’

1. An approximate number is one that represents the exact number to a certain


degree of accuracy.
2. The significant digits of a number are those digits that carry meaning,
contributing to precision of the number.
3. The best way to identify the significant digits in a given number is to write
the number in the scientific notation with the first digit being non-zero.
4. Errors are introduced in numerical calculations by the use of approximate
numbers.
5. When two numbers are added, the magnitude of absolute error in the result
is the sum of the magnitudes of the absolute errors in the two numbers.

6. The two formats in which real numbers can be written are the fixed-point
(without exponent) format and the floating point (with exponent) format.
7. If two numbers represented in normalized floating point notation are to be
added, the exponents of the two numbers must be made equal and the
mantissa shifted appropriately.
8. In division, the mantissa of the numerator is divided by that of the
denominator.
9. In algorithms it is not advisable to write a branching or looping instruction
that compares the value of an expression with zero.

1.8 QUESTIONS AND EXERCISES

Short-Answer Questions
1. What are the guidelines or rules for rounding off a number to n-significant
digits?
2. Write a short note on truncation errors.
3. Round off the following numbers to four significant digits: 0.00032217,
35.46735 and 18.265101 and compute Ea, Er and Ep in each case.
4. Which standard defines the format for representing floating point numbers?
5. On what does the range of a floating point number depend?
Long-Answer Questions
1. Discuss the different types of errors that occur during numerical
computations.
2. Describe errors in numerical computations.
3. Add the following floating point numbers:
(i) 0.4546 E 5 and 0.5433 E 5
(ii) 0.4546 E 5 and 0.5567 E 7
(iii) 0.4546 E 3 and 0.5567 E 7
(iv) 0.6434 E 3 and 0.4845 E 3
(v) 0.6434 E 99 and 0.4845 E 99.
4. Given a value of x = 3.51 with an error δx = 0.001, estimate the resulting
error in the function y = x3.
5. Find the relative error in the function
y = a x1^m1 x2^m2 ... xn^mn.
6. Explain some drawbacks of floating point representation with the help of an
example.

1.9 FURTHER READING

Mott, J.L. Discrete Mathematics for Computer Scientists, 2nd edition. New
Delhi: Prentice-Hall of India Pvt. Ltd., 2007.
Bonini, C.P., W.H. Hausman H. and Har Bierman. Quantitative Analysis for
Business Decisions. Illinois: Richard D. Irwin, 1986.
Charnes, A., W.W. Cooper, and A. Henderson. An Introduction to Linear
Programming. New York: John Wiley & Sons, 1953.
Gupta, S.C. and V.K. Kapoor. Fundamentals of Mathematical Statistics, 9th
revised edition. New Delhi: Sultan Chand & Sons, 1997.

Interpolation

UNIT 2 INTERPOLATION
Structure
2.0 Introduction
2.1 Unit Objectives
2.2 Polynomial Interpolation
2.2.1 Finite Differences
2.2.2 Differences of a Polynomial
2.2.3 Some Useful Symbols
2.3 Missing Term Technique
2.3.1 Effect of an Error on a Difference Table
2.4 Newton’s Formulae for Interpolation
2.4.1 Newton–Gregory Forward Interpolation Formula
2.4.2 Newton–Gregory Backward Interpolation Formula
2.5 Central Difference Interpolation Formula
2.5.1 Gauss’s Forward Difference Formula
2.5.2 Gauss’s Backward Difference Formula
2.5.3 Stirling’s Formula
2.5.4 Bessel’s Interpolation Formula
2.5.5 Lagrange’s Interpolation Formula
2.6 Newton’s Divided Difference Interpolation Formula
2.7 Discrete Least Squares Approximation
2.7.1 Fourier Series
2.8 Summary
2.9 Key Terms
2.10 Answers to ‘Check Your Progress’
2.11 Questions and Exercises
2.12 Further Reading

2.0 INTRODUCTION
The process of estimating or approximating the value of the dependent variable y for
a given value of x in the specified range, say xi ≤ x ≤ xn, is known as ‘interpolation’.
If the approximating function φ(x) is a polynomial, the process is known as
‘polynomial interpolation’.
Sometimes you want to estimate the value of y for a given value of x
that lies outside the range xi ≤ x ≤ xn; the process of approximating y = f(x)
by another function φ(x) is then known as extrapolation.
Interpolation is a very important topic in numerical analysis as it provides
the base for numerical differentiation and integration.
The study of interpolation is based on the calculus of finite differences. Finite
differences deal with the variations in a function corresponding to the changes in
the independent variables. These changes occur when independent variable x
takes finite jumps which may be equal or unequal. In this unit, you will study the

variations in the function when the independent variable changes by equal intervals.
Since finite differences are the base of interpolation, our discussion will begin with
the definition of finite differences only.
In this unit, you will learn about polynomial interpolation, missing term
technique, Newton’s formulae for interpolation, Newton’s divided difference
interpolation formula and discrete least squares approximation.

2.1 UNIT OBJECTIVES

After going through this unit, you will be able to:


• Understand polynomial interpolation
• Explain missing term technique
• Describe Newton’s formulae for interpolation
• Know the central difference interpolation formulae
• Understand Newton’s divided difference interpolation formula
• Explain discrete least squares approximation

2.2 POLYNOMIAL INTERPOLATION

Consider two variables x and y such that y = f(x); then x is the independent
variable, y is the dependent variable and f is the function from x to y. If the function
f(x) is known explicitly, then the value of y corresponding to any x can be found
easily, whereas if the function f(x) is not known it is very difficult to determine the
f(x) is known explicitly, then the value of y corresponding to any x can be found
easily whereas if the function f(x) is not known it is very difficult to determine the
exact form of f(x) with the help of the set of values (xi, yi). In such cases the
function f(x) is replaced by a simpler function φ(x) which assumes the same values
at points (xi, yi). This process of estimating or approximating the value of the
dependent variable y for a given value of x in the specified range, say xi ≤ x ≤ xn,
is known as interpolation. If the approximating function φ(x) is a polynomial, the
process is known as polynomial interpolation.
Now if you want to calculate the value of independent variable x for a given
value of dependent variable y, the process is known as ‘inverse interpolation’.
Note that the approximating function φ(x) can be of any form such as polynomial,
trigonometric, logarithmic, etc. But here only polynomial interpolation will be
discussed.
Sometimes you want to estimate the value of y for a given value of x
that lies outside the range xi ≤ x ≤ xn; the process of approximating y = f(x)
by another function φ(x) is then known as extrapolation.

Interpolation provides the base for numerical differentiation and integration.

The study of interpolation is based on the calculus of finite differences. Finite


differences deal with the variations in a function corresponding to the changes in
the independent variables. These changes occur when independent variable
NOTES
x takes finite jumps which may be equal or unequal. Here, you will study the
variations in the function when the independent variable changes by equal intervals.
Since finite differences are the base of interpolation, let the discussion begin with
the definition of finite differences only.
2.2.1 Finite Differences
Let y = f(x) be a function of x, where x is the independent and y the dependent
variable.
Suppose y = f(x) is known for some values of x which are equally spaced
as x0, x0 + h, x0 + 2h, ..., x0 + nh, here h is the small jump that x takes and is
known as the step size. The corresponding values of y will be y0, y1, y2, ..., yn.
As the name finite differences implies, these are the differences either between
the values of the function or differences between the past differences. There are
four types of differences:
1. Forward difference
2. Backward difference
3. Central difference
4. Divided difference
1. Forward difference: The differences y1 – y0, y2 – y1, ..., yn – yn – 1 are
known as the first forward differences. They are denoted respectively as ∆y0,
∆y1, ..., ∆yn – 1, where ∆ is the forward difference operator. Thus, the first
forward difference is
∆yk = yk + 1 − yk
The second forward differences are the differences of the first forward
differences. They are denoted as ∆2y0, ∆2y1, ..., ∆2yn – 2 and defined respectively as
∆2y0 = ∆y1 − ∆y0, ∆2y1 = ∆y2 − ∆y1, ...,
∆2yn – 2 = ∆yn – 1 − ∆yn – 2
and so on.
In general, the rth forward differences are given as

∆ r yk = ∆ r −1 yk +1 − ∆ r −1 yk .
These differences are systematically set out in Table 2.1.

Table 2.1 Forward Difference Table

Value     Value   1st     2nd     3rd     4th     5th     6th
of x      of y    Diff.   Diff.   Diff.   Diff.   Diff.   Diff.
x0        y0
                  ∆y0
x0 + h    y1              ∆2y0
                  ∆y1             ∆3y0
x0 + 2h   y2              ∆2y1            ∆4y0
                  ∆y2             ∆3y1            ∆5y0
x0 + 3h   y3              ∆2y2            ∆4y1            ∆6y0
                  ∆y3             ∆3y2            ∆5y1
x0 + 4h   y4              ∆2y3            ∆4y2
                  ∆y4             ∆3y3
x0 + 5h   y5              ∆2y4
                  ∆y5
x0 + 6h   y6

The higher order differences can be expressed in terms of the values y, which are
known as entries, whereas x is called the argument in the difference table.
The first value y0 in Table 2.1 is called the leading term and the differences
∆y0 , ∆ 2 y0 , ...,etc. are known as the leading differences.

Thus, ∆2y0 = ∆y1 − ∆y0
= (y2 − y1) − (y1 − y0)
= y2 – 2y1 + y0
and ∆3y0 = ∆2y1 − ∆2y0
= (∆y2 − ∆y1) − (∆y1 − ∆y0)
= ∆y2 − 2∆y1 + ∆y0
= (y3 − y2) − 2(y2 − y1) + (y1 − y0)
= y3 − 3y2 + 3y1 − y0
Similarly, ∆4y0 = ∆3y1 − ∆3y0
= (y4 − 3y3 + 3y2 − y1) − (y3 − 3y2 + 3y1 − y0)
= y4 − 4y3 + 6y2 − 4y1 + y0
You can see that the coefficients occurring on the right-hand side are the
binomial coefficients,
∆4y0 = y4 − 4C1 y3 + 4C2 y2 − 4C3 y1 + 4C4 y0
Thus, in general, you can say that,
∆k y0 = yk − kC1 yk − 1 + kC2 yk − 2 − ... + (−1)k y0


In function notation, the forward differences are written as follows:
∆f ( x ) = f(x + h) – f(x)

∆ 2 f ( x) = f(x + 2h) – 2f(x + h) + f(x)


and so on.
Properties of ∆: The forward difference operator ∆ satisfies the following
properties:
(i) ∆[f(x) ± g(x)] = ∆f(x) ± ∆g(x), i.e., ∆ is linear.
(ii) ∆[a f(x)] = a . ∆f(x), a being a constant.
(iii) ∆m ∆n f(x) = ∆m + n f(x), where m and n are positive integers.
(iv) ∆[f(x) . g(x)] ≠ ∆f(x) . ∆g(x)
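The forward-difference table of Table 2.1 can be generated mechanically. The following is a minimal sketch (the helper name is mine, not the book's): column r holds the rth forward differences ∆r yk = ∆r−1 yk+1 − ∆r−1 yk.

```python
# Sketch (helper name assumed): build the forward-difference table for a
# list of equally spaced y-values.  table[r][k] is the rth difference Δ^r y_k.
def forward_difference_table(y):
    table = [list(y)]
    while len(table[-1]) > 1:
        prev = table[-1]
        table.append([prev[k + 1] - prev[k] for k in range(len(prev) - 1)])
    return table

# Example: for y = k^2 at k = 0..4, the second differences are constant.
table = forward_difference_table([0, 1, 4, 9, 16])
print(table[1])  # first differences  [1, 3, 5, 7]
print(table[2])  # second differences [2, 2, 2]
```

Each new column is one entry shorter than the previous one, exactly as in the staggered layout of Table 2.1.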
2. Backward difference: The differences y1 – y0, y2 – y1, y3 – y2, ..., yn –
yn – 1 when denoted as ∇y1, ∇y2 , ..., ∇yn respectively, are called the first
backward differences. Here, ∇ is called the backward difference operator.
Thus, the first backward differences are:
∇y1 = y1 – y0
∇y2 = y2 – y1
...
∇yn = yn – yn – 1
Now, the second backward differences are obtained by taking the differences
of the first backward differences as:
∇ 2 y2 = ∇y2 − ∇y 1
= ( y2 − y1 ) − ( y1 − y0 )
= y2 − 2 y1 + y0

∇2y3 = ∇y3 − ∇y2 = y3 − 2y2 + y1
and so on.

Table 2.2 shows the backward difference table:
Table 2.2 Backward Difference Table

Value     Value   1st     2nd     3rd     4th     5th
of x      of y    Diff.   Diff.   Diff.   Diff.   Diff.
x0        y0
                  ∇y1
x0 + h    y1              ∇2y2
                  ∇y2             ∇3y3
x0 + 2h   y2              ∇2y3            ∇4y4
                  ∇y3             ∇3y4            ∇5y5
x0 + 3h   y3              ∇2y4            ∇4y5
                  ∇y4             ∇3y5
x0 + 4h   y4              ∇2y5
                  ∇y5
x0 + 5h   y5

Properties of ∇: The backward difference operator ∇ satisfies the following
properties:
(i) ∇[f(x) ± g(x)] = ∇f(x) ± ∇g(x), i.e., ∇ is linear.
(ii) ∇[a f(x)] = a . ∇f(x), where a is a constant.
(iii) ∇m ∇n f(x) = ∇m + n f(x).
(iv) ∇[f(x) . g(x)] ≠ ∇f(x) . ∇g(x).
Note: Any value of y can be expressed in terms of yn and the backward
difference ∇yn , ∇ 2 yn ... etc. You know that
∇yn = yn − yn − 1
Or yn − 1 = yn − ∇yn = (1 − ∇) yn ...(2.1)
Thus yn − 2 = yn − 1 − ∇yn − 1 = (1 − ∇) yn −1 = (1 − ∇) 2 yn
(Using Equation (2.1))
Similarly,
yn − 3 = yn − 2 − ∇yn − 2
= (1 − ∇) yn − 2
= (1 − ∇)2 yn −1
= (1 − ∇ )3 yn
Thus, in general,
yn − k = (1 − ∇)k yn
Or, yn − k = yn − kC1 ∇yn + kC2 ∇2yn − ... + (−1)k ∇k yn
(Using binomial expansion)
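This binomial expansion can be checked numerically. The sketch below (helper name mine, not the book's) recovers yn−k from the trailing backward differences of a cubic sequence:

```python
from math import comb

# Hedged check of  y_{n-k} = (1 - ∇)^k y_n = Σ_{i=0}^{k} (-1)^i kCi ∇^i y_n.
def backward_diffs_at_end(y, order):
    """Return [∇^0 y_n, ∇^1 y_n, ..., ∇^order y_n] for the last entry y_n."""
    diffs, cur = [y[-1]], list(y)
    for _ in range(order):
        cur = [cur[j + 1] - cur[j] for j in range(len(cur) - 1)]
        diffs.append(cur[-1])
    return diffs

y = [x**3 for x in range(6)]        # 0, 1, 8, 27, 64, 125; here n = 5
k = 2
d = backward_diffs_at_end(y, k)
recovered = sum((-1)**i * comb(k, i) * d[i] for i in range(k + 1))
print(recovered, y[-1 - k])         # both equal 27 = y_{n-2}
```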
3. Central difference: Another system of differences is known as the central
difference denoted by δ and is defined as:
δy1/2 = y1 – y0, δy3/2 = y2 – y1, ...,
δyn – 1/2 = yn – yn – 1
and so on.
Similarly, the higher order difference are obtained as:
δ2 y 1 = δy3/2 − δy1/2 ; δ 2 y2 = δy5/2 − δy3/2

δ3 y3/2 = δ2y2 – δ2y1


and so on.
These differences are shown in Table 2.3.
Table 2.3 Central Difference Table

Value     Value   1st     2nd     3rd     4th
of x      of y    Diff.   Diff.   Diff.   Diff.
x0        y0
                  δy1/2
x0 + h    y1              δ2y1
                  δy3/2           δ3y3/2
x0 + 2h   y2              δ2y2            δ4y2
                  δy5/2           δ3y5/2
x0 + 3h   y3              δ2y3
                  δy7/2
x0 + 4h   y4
It is clear from Table 2.3 that the differences on the same horizontal line
have the same suffix. Also the differences of odd order are known only for
half values of the suffix and those of even order for integral values of the
suffix.
Sometimes, it is required to find the mean µ of adjacent values in the same
column of differences. This is defined as:
µδy1 = (1/2)(δy1/2 + δy3/2)
µδ2y3/2 = (1/2)(δ2y1 + δ2y2)
and so on.
4. Divided difference: If the values of function y = f(x) at x1, x2, x3,..., xn are
given as y1, y2, ..., yn, where x1, x2, ..., xn are not equally spaced, then the
difference obtained as,
(y2 − y1)/(x2 − x1), (y3 − y2)/(x3 − x2), ..., (yn − yn – 1)/(xn − xn – 1)
is known as the first divided difference and denoted as ∆| y1, ∆| y2, ..., ∆| yn – 1.

The second order divided differences are obtained as:
∆|2y1 = (∆| y2 − ∆| y1)/(x3 − x1); ∆|2y2 = (∆| y3 − ∆| y2)/(x4 − x2)
In general,
∆|i yk = (∆|i − 1 yk + 1 − ∆|i − 1 yk)/(xi + k − xk)
Table 2.4 represents the divided difference table.

Table 2.4 Divided Difference Table

Value   Value   1st     2nd     3rd     4th
of x    of y    Diff.   Diff.   Diff.   Diff.
x0      y0
                ∆| y0
x1      y1              ∆|2y0
                ∆| y1           ∆|3y0
x2      y2              ∆|2y1           ∆|4y0
                ∆| y2           ∆|3y1
x3      y3              ∆|2y2
                ∆| y3
x4      y4
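The divided differences of Table 2.4 can also be computed by a short routine (helper name mine, not the book's); note the denominator spans `order` arguments:

```python
# Sketch: divided-difference table for unequally spaced arguments x_k.
def divided_difference_table(x, y):
    table, order = [list(y)], 1
    while len(table[-1]) > 1:
        prev = table[-1]
        table.append([(prev[k + 1] - prev[k]) / (x[k + order] - x[k])
                      for k in range(len(prev) - 1)])
        order += 1
    return table

# For f(x) = x^2, every second divided difference is 1 whatever the spacing.
x = [0, 1, 3, 7]
tbl = divided_difference_table(x, [xi**2 for xi in x])
print(tbl[1])   # first divided differences  [1.0, 4.0, 10.0]
print(tbl[2])   # second divided differences [1.0, 1.0]
```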

2.2.2 Differences of a Polynomial


The nth differences of a polynomial of nth degree are constant and all higher order
differences are zero when the values of the independent variable are at equal
interval.
Let f(x) = a0 + a1x + a2x2 + ... + anxn
Then ∆f(x) = f(x + h) – f(x)
= a1[(x + h) − x] + a2[(x + h)2 − x2] + ... + an[(x + h)n − xn]
= a1h + a2[2C1 xh + h2] + a3[3C1 x2h + 3C2 xh2 + h3]
+ ... + an[nC1 xn − 1h + nC2 xn − 2h2 + ... + nCn hn]
= b1 + b2x + b3x2 + ... + bn − 1xn − 2 + nan h xn − 1,
which is a polynomial of degree (n − 1).
Again, ∆2f(x) = ∆f(x + h) − ∆f(x)
= b1 + b2(x + h) + b3(x + h)2 + ... + bn − 1(x + h)n − 2 + nhan(x + h)n − 1
– [b1 + b2x + b3x2 + ... + bn − 1xn − 2 + nan h xn − 1]
= c2 + c3x + c4x2 + ... + cn − 1xn − 3 + n(n − 1)h2an xn − 2,
which is clearly a polynomial of degree (n – 2).
Thus, continuing the process,
∆nf(x) = n(n − 1)(n − 2) ... 3 . 2 . 1 . hn an = n! hn an
Thus, the nth difference is a constant and all higher differences are zero, i.e.,
the (n + 1)th and higher differences of a polynomial of nth degree are zero.
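This result is easy to verify numerically. The sketch below (setup assumed, not from the text) tabulates a cubic at step h and checks that the third differences equal the constant 3! h3 a3 while the fourth differences vanish:

```python
# Hedged check: nth forward differences of a degree-n polynomial are n! h^n a_n.
def nth_forward_diff(values, n):
    cur = list(values)
    for _ in range(n):
        cur = [cur[k + 1] - cur[k] for k in range(len(cur) - 1)]
    return cur

h = 0.5
f = lambda x: 2 * x**3 - x + 4          # a_3 = 2, so 3! h^3 a_3 = 1.5
ys = [f(k * h) for k in range(7)]
print(nth_forward_diff(ys, 3))          # [1.5, 1.5, 1.5, 1.5]
print(nth_forward_diff(ys, 4))          # [0.0, 0.0, 0.0]
```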
2.2.3 Some Useful Symbols
(1) Forward shift operator (E): Defined by the equation
Eyr = yr + 1 or Ef(x) = f(x + h)
which shows that the effect of E is to shift the functional value yr to the next higher
value yr + 1.
E 2 yr = E ( Eyr ) = Eyr +1 = yr + 2

Thus, E n y r = yr + n
Now, ∆y0 = y1 − y0 = Ey0 − y0 = (E − 1)y0
⇒ ∆ = E – 1 or E ≡ 1 + ∆

Example: ∆3y0 = (E − 1)3y0 = (E3 − 3E2 + 3E − 1)y0
= E3y0 − 3E2y0 + 3Ey0 − y0
= y3 − 3y2 + 3y1 − y0
(2) Mean operator (µ): Defined by the equation,
µyr = (1/2)(yr + 1/2 + yr – 1/2)
Relation between operators: (i) E = 1 + ∆
Proof: ∆yx = yx + h − yx
= Eyx − yx = (E − 1)yx
⇒ ∆ = E – 1 or E = 1 + ∆
Note: Separation of symbols means E − 1 = ∆.


(ii) ∇ = 1 – E–1 [E–1 is the backward shift operator]
Proof: ∇y x = y x − y x − h

= y x − E −1 y x = (1 − E −1 ) y x ⇒ ∇ ≡ 1 − E −1 .

(iii) δ ≡ E1/2 − E−1/2
Proof: δyx = yx + h/2 − yx − h/2 = E1/2yx − E–1/2yx = (E1/2 − E–1/2)yx
⇒ δ ≡ E1/2 − E–1/2
(iv) µ = (1/2)(E1/2 + E−1/2)
Proof: µyx = (1/2)(yx + h/2 + yx − h/2)
= (1/2)(E1/2 + E−1/2)yx
⇒ µ ≡ (1/2)(E1/2 + E−1/2)
(v) ∆ = E∇ = ∇E = δE1/2
Proof: E(∇yx) = E(yx − yx − h) = yx + h − yx = ∆yx
⇒ E∇ ≡ ∆
Similarly,
∇(Eyx) = ∇(yx + h) = yx + h − yx = ∆yx
⇒ ∇E ≡ ∆
δ(E1/2yx) = δ(yx + h/2) = yx + h − yx = ∆yx
⇒ δE1/2 ≡ ∆

(vi) E ≡ ehD
Proof: Ef(x) = f(x + h)
= f(x) + hf ′(x) + (h2/2!) f ″(x) + ...
= f(x) + hD f(x) + (h2D2/2!) f(x) + ...
= [1 + hD + h2D2/2! + ...] f(x)
= ehD f(x)
⇒ E ≡ ehD or 1 + ∆ = ehD or ∆ = ehD − 1
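The operator relations above can be spot-checked on tabulated values. A hedged numerical sketch (setup assumed, not from the text), acting on f(x) = x2 with step h:

```python
# Spot-check of the operator relations ∆ = E − 1, ∇ = ∆E^{-1}, ∆ = δE^{1/2}.
h = 0.25
f = lambda x: x * x

E  = lambda g, x: g(x + h)                      # shift:    E f(x) = f(x + h)
fd = lambda g, x: g(x + h) - g(x)               # forward:  ∆ f(x)
bd = lambda g, x: g(x) - g(x - h)               # backward: ∇ f(x)
cd = lambda g, x: g(x + h / 2) - g(x - h / 2)   # central:  δ f(x)

x = 1.0
print(fd(f, x) == E(f, x) - f(x))               # ∆ = E − 1       -> True
print(bd(f, x) == fd(f, x - h))                 # ∇ = ∆ E^{-1}    -> True
print(abs(fd(f, x) - cd(f, x + h / 2)) < 1e-12) # ∆ = δ E^{1/2}   -> True
```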
Example 2.1: Using the method of separation of symbols, show that:
∆nux − n = ux − nux − 1 + [n(n − 1)/2!] ux − 2 − ... + (−1)n ux − n
Solution: R.H.S. = ux − nux − 1 + [n(n − 1)/2!] ux − 2 − ... + (−1)n ux − n
= [1 − nE−1 + (n(n − 1)/2!) E−2 − ... + (−1)n E−n] ux
= (1 − E−1)n ux = (1 − 1/E)n ux = [(E − 1)/E]n ux
= (∆n/En) ux = ∆n . E−n ux = ∆nux − n = L.H.S.
Hence proved. (∵ E – 1 = ∆)
Example 2.2: Show that:
ex [u0 + x ∆u0 + (x2/2!) ∆2u0 + ...] = u0 + u1x + u2 (x2/2!) + ...
Solution: L.H.S. = ex [1 + x∆ + (x2∆2/2!) + ...] u0 = ex . ex∆ u0 = ex + x∆ u0
= ex(1 + ∆) u0 = exE u0
= [1 + xE + (x2E2/2!) + ...] u0
= u0 + xEu0 + (x2/2!) E2u0 + ...
= u0 + x u1 + (x2/2!) u2 + ... = R.H.S.

Example 2.3: Evaluate the following:
(i) ∆ tan–1 x (ii) ∆2 cos 2x
where h is the interval of differencing.
Solution: (i) ∆ tan−1x = tan−1(x + h) − tan−1x
= tan−1[(x + h − x)/(1 + x(x + h))] = tan−1[h/(1 + hx + x2)]
(ii) ∆2 cos 2x = ∆[cos 2(x + h) – cos 2x]
= [cos 2(x + 2h) – cos 2(x + h)] – [cos 2(x + h) – cos 2x]
= – 2 sin (2x + 3h) sin h + 2 sin (2x + h) sin h
= – 2 sin h [2 cos (2x + 2h) sin h] = – 4 sin2 h cos 2(x + h).
Example 2.4: Evaluate:
∆2 [(5x + 12)/(x2 + 5x + 6)], the interval of differencing being unity.
Solution: ∆2 [(5x + 12)/((x + 2)(x + 3))]
= ∆2 [2/(x + 2) + 3/(x + 3)] = ∆[∆(2/(x + 2)) + ∆(3/(x + 3))]
= ∆[2(1/(x + 3) − 1/(x + 2)) + 3(1/(x + 4) − 1/(x + 3))]
= −2∆[1/((x + 2)(x + 3))] − 3∆[1/((x + 3)(x + 4))]
= −2[1/((x + 3)(x + 4)) − 1/((x + 2)(x + 3))]
− 3[1/((x + 4)(x + 5)) − 1/((x + 3)(x + 4))]
= 4/((x + 2)(x + 3)(x + 4)) + 6/((x + 3)(x + 4)(x + 5))
= 2(5x + 16)/((x + 2)(x + 3)(x + 4)(x + 5)).
Example 2.5: Prove that hD = –log (1 – ∇) = sinh–1 (µδ).
Solution: hD = log E = – log (E–1) = – log (1 – ∇) [∵ E–1 = 1 – ∇]
Also, µ = (1/2)(E1/2 + E−1/2) and δ = E1/2 − E−1/2
∴ µδ = (1/2)(E − E−1) = (1/2)(ehD − e−hD) = sinh (hD)
or hD = sinh−1(µδ).
Example 2.6: (i) Find the function whose first difference is ex.
(ii) If ∆3ux = 0, prove that,
ux + 1/2 = (1/2)(ux + ux + 1) − (1/16)(∆2ux + ∆2ux + 1).
Solution: (i) ∆ex = ex + h − ex = (eh − 1)ex
⇒ ex = ∆ex/(eh − 1)
Hence, ∆[ex/(eh − 1)] = ex, i.e., the required function is f(x) = ex/(eh − 1).
(ii) ux + 1/2 = E1/2ux = (1 + ∆)1/2ux
= [1 + (1/2)∆ − (1/8)∆2] ux [∵ ∆3ux = 0] ...(1)
Now, ∆3ux = 0
⇒ ∆2ux + 1 − ∆2ux = 0
⇒ ∆2ux + 1 = ∆2ux, and ∆ux = ux + 1 − ux
∴ From Equation (1),
ux + 1/2 = ux + (1/2)(ux + 1 − ux) − (1/8) . (1/2)(∆2ux + ∆2ux + 1)
= (1/2)(ux + ux + 1) − (1/16)(∆2ux + ∆2ux + 1).
Example 2.7: (i) Find f(6) if f(0) = –3, f(1) = 6, f(2) = 8, f(3) = 12; third
difference being constant.
(ii) Find ∆10 (1 − ax)(1 − bx 2 )(1 − cx3 )(1 − dx 4 ).

(iii) Evaluate ∆ n (ax n + bx n −1 ).


Solution: (i) The difference table is shown below:
Difference Table

x    f(x)    ∆f(x)    ∆2f(x)    ∆3f(x)
0    –3
             9
1    6                –7
             2                  9
2    8                2
             4
3    12

Since the third difference is constant,
f(0 + 6) = E6f(0) = (1 + ∆)6f(0) = (1 + 6∆ + 15∆2 + 20∆3)f(0)
= – 3 + 6(9) + 15(–7) + 20(9)
= – 3 + 54 – 105 + 180
= 126.
(ii) Maximum power of x in the polynomial will be 10 and the coefficient of
x10 will be abcd.
Here, k = abcd, h = 1, n = 10
∴ Expression = k hn n! = abcd 10!.
(iii) ∆n(axn + bxn − 1) = a ∆n(xn) + b ∆n(xn − 1) = a(n!) + b(0) = a . n!.

Example 2.8: Prove that:


(i) δ[f(x) g(x)] = µf(x) δg(x) + µg(x) δf(x)
(ii) δ[f(x)/g(x)] = [µg(x) δf(x) − µf(x) δg(x)] / [g(x − 1/2) g(x + 1/2)]
the interval of differencing being unity.
Solution: (i) R.H.S. = µf(x) δg(x) + µg(x) δf(x)
= [(E1/2 + E−1/2)/2] f(x) . (E1/2 − E−1/2) g(x)
+ [(E1/2 + E−1/2)/2] g(x) . (E1/2 − E−1/2) f(x)
= (1/2)[f(x + 1/2) + f(x – 1/2)][g(x + 1/2) − g(x − 1/2)]
+ (1/2)[g(x + 1/2) + g(x − 1/2)][f(x + 1/2) − f(x − 1/2)]
= f(x + 1/2) g(x + 1/2) − f(x − 1/2) g(x − 1/2)
= E1/2[f(x) g(x)] − E−1/2[f(x) g(x)]
= (E1/2 − E−1/2) f(x) g(x) = δ[f(x) g(x)] = L.H.S.
(ii) Numerator of R.H.S. = µg(x) δf(x) − µf(x) δg(x)
= [(E1/2 + E−1/2)/2] g(x) . (E1/2 − E−1/2) f(x)
− [(E1/2 + E−1/2)/2] f(x) . (E1/2 − E−1/2) g(x)
= (1/2)[g(x + 1/2) + g(x − 1/2)][f(x + 1/2) − f(x − 1/2)]
− (1/2)[f(x + 1/2) + f(x − 1/2)][g(x + 1/2) – g(x – 1/2)]
= f(x + 1/2) g(x – 1/2) − f(x − 1/2) g(x + 1/2)
∴ R.H.S. = [f(x + 1/2) g(x − 1/2) − f(x − 1/2) g(x + 1/2)] / [g(x − 1/2) g(x + 1/2)]
= f(x + 1/2)/g(x + 1/2) − f(x − 1/2)/g(x − 1/2)
= E1/2 [f(x)/g(x)] − E−1/2 [f(x)/g(x)]
= (E1/2 − E−1/2) [f(x)/g(x)] = δ [f(x)/g(x)] = L.H.S.

CHECK YOUR PROGRESS


1. What is inverse interpolation?
2. What are first backward differences?
3. Which equation defines shift operator (E)?

2.3 MISSING TERM TECHNIQUE

Suppose n values out of (n + 1) values of y = f(x) are given, the values of x being
equidistant.
Let the unknown value be N.

Since only n values of y are known, you can assume y = f(x) to be a
polynomial of degree (n – 1) in x.
By equating the nth difference to zero, you can get the value of N.
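The technique above can be sketched in a few lines (helper name mine, not the book's): set the nth leading difference to zero and solve the resulting linear equation for the unknown N.

```python
from math import comb

# Sketch of the missing-term technique: with n of the (n + 1) equally spaced
# values known, impose  Δ^n y_0 = Σ_{i=0}^{n} (-1)^i nCi y_{n-i} = 0.
def fill_missing(ys):
    """ys: list of values with exactly one None; returns the filled list."""
    n = len(ys) - 1
    j = ys.index(None)
    coeff = [(-1) ** i * comb(n, i) for i in range(n + 1)]  # acts on y_n..y_0
    known = sum(c * ys[n - i] for i, c in enumerate(coeff) if n - i != j)
    out = list(ys)
    out[j] = -known / coeff[n - j]
    return out

# y = x^2 at x = 0..4 with y(2) missing is recovered exactly.
print(fill_missing([0, 1, None, 9, 16]))  # [0, 1, 4.0, 9, 16]
```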
Example 2.9: Express f(x) = x4 – 12x3 + 24x2 – 30x + 9 and its successive
differences in factorial notation. Hence, show that ∆5f(x) = 0.

Solution: Let f(x) = A[x]4 + B[x]3 + C[x]2 + D[x] + E
Using the method of synthetic division, divide by x, x – 1, x – 2, x – 3
successively; then

      1   –12    24   –30  |  9 = E
1 |         1   –11    13
      1   –11    13   –17 = D
2 |         2   –18
      1    –9    –5 = C
3 |         3
      1    –6 = B
      1 = A

Hence, f(x) = [x]4 – 6[x]3 – 5[x]2 – 17[x] + 9


3 2
∴ ∆f ( x ) = 4[ x] − 18[ x] − 10[ x] − 17
2
∆ 2 f ( x) = 12[ x] − 36[ x] − 10
∆3 f ( x) = 24[x] – 36

∆ 4 f ( x) = 24
and ∆5 f ( x) = 0.
Example 2.10: Using the method of separation of symbols, show that
u0 − u1 + u2 − u3 + ... = (1/2) u0 − (1/4) ∆u0 + (1/8) ∆2u0 − ... .
Solution: R.H.S. = (1/2)[1 − (1/2)∆ + ((1/2)∆)2 − ((1/2)∆)3 + ...] u0
= (1/2) [1 + (1/2)∆]−1 u0
= (2 + ∆)−1 u0 = (1 + E)−1 u0
= (1 – E + E2 – E3 + ...) u0
= u0 − u1 + u2 − u3 + ... = L.H.S.
Example 2.11: Use the method of separation of symbols to prove the following
identities:
(i) ux + xC1 ∆2ux − 1 + xC2 ∆4ux − 2 + ... = u0 + xC1 ∆u1 + xC2 ∆2u2 + ...
(ii) ux + n = un + xC1 ∆un − 1 + x + 1C2 ∆2un − 2 + x + 2C3 ∆3un − 3 + ...
(iii) u0 + u1 + u2 + ... + un
= n + 1C1 u0 + n + 1C2 ∆u0 + n + 1C3 ∆2u0 + ... + ∆nu0.
Solution: (i) L.H.S. = (1 + xC1 ∆2E−1 + xC2 ∆4E−2 + ...) ux
= (1 + ∆2E−1)x ux = [(E + ∆2)/E]x ux = [(E2 − E + 1)/E]x ux
= (1/Ex) [1 + E(E − 1)]x ux
= E−x (1 + ∆E)x ux = (1 + ∆E)x u0
= (1 + xC1 ∆E + xC2 ∆2E2 + ...) u0
= u0 + xC1 ∆u1 + xC2 ∆2u2 + ... = R.H.S.
(ii) R.H.S. = (1 + xC1 ∆E−1 + x + 1C2 ∆2E−2 + x + 2C3 ∆3E−3 + ...) un
= (1 − ∆E−1)−x un
= [(E − ∆)/E]−x un = [1/E]−x un
= Ex un = un + x = L.H.S.
(iii) L.H.S. = u0 + Eu0 + E2u0 + ... + Enu0
= (1 + E + E2 + ... + En) u0
= [(En + 1 − 1)/(E − 1)] u0 = [((1 + ∆)n + 1 − 1)/∆] u0
= (1/∆)[(1 + n + 1C1∆ + n + 1C2 ∆2 + n + 1C3 ∆3 + ... + ∆n + 1) – 1] u0
= n + 1C1 u0 + n + 1C2 ∆u0 + n + 1C3 ∆2u0 + ... + ∆nu0 = R.H.S.
2.3.1 Effect of an Error on a Difference Table
Suppose there is an error ε in the entry of yS of a table. As higher differences
are formed this error spreads out and is considerably magnified, as shown in
Table 2.5.
Table 2.5 Effect of Error on a Difference Table

x     y        ∆y         ∆2y         ∆3y         ∆4y
x0    y0
               ∆y0
x1    y1                  ∆2y0
               ∆y1                    ∆3y0
x2    y2                  ∆2y1                    ∆4y0
               ∆y2                    ∆3y1
x3    y3                  ∆2y2                    ∆4y1 + ε
               ∆y3                    ∆3y2 + ε
x4    y4                  ∆2y3 + ε                ∆4y2 – 4ε
               ∆y4 + ε                ∆3y3 – 3ε
x5    y5 + ε              ∆2y4 – 2ε               ∆4y3 + 6ε
               ∆y5 – ε                ∆3y4 + 3ε
x6    y6                  ∆2y5 + ε                ∆4y4 – 4ε
               ∆y6                    ∆3y5 – ε
x7    y7                  ∆2y6                    ∆4y5 + ε
               ∆y7
x8    y8                  ∆2y7
               ∆y8
x9    y9

From Table 2.5 it can be seen that:
(i) The error increases with the order of the difference.
(ii) The coefficients of the ε's in any column are the binomial coefficients in
the expansion of (1 – ε)n; e.g., in the ∆2 column,
(1 – ε)2 = 1 − 2C1 ε + 2C2 ε2 = 1 − 2ε + ε2
Similarly, (1 – ε)4 = 1 − 4C1 ε + 4C2 ε2 – 4C3 ε3 + 4C4 ε4
= 1 − 4ε + 6ε2 − 4ε3 + ε4.

(iii) The algebraic sum of all the errors in any column is zero.
(iv) The maximum error in each column occurs opposite to the entry
containing the error, i.e., here opposite to y5.
Example 2.12: Find the error and correct the wrong figure in the following
functional values:
2, 5, 10, 18, 26, 37, 50
Solution: The following table shows the differences of the functional values:
Table 2.6 Difference Table for the Functional Values 2, 5, 10, 18, 26, 37, 50

x    y     ∆y    ∆2y    ∆3y    ∆4y    ∆5y    ∆6y
1    2
           3
2    5           2
           5             1
3    10          3              –4
           8             –3             10
4    18          0              6               –20
           8             3              –10
5    26          3              –4
           11            –1
6    37          2
           13
7    50
Sum of all the third differences is zero.
Let ε be the error. Now the two adjacent values are equal in magnitude,
i.e., – 3 and 3. The coefficients in this column = Binomial coefficient in the expression
of (1 – ε)3 = 1 – 3ε + 3ε2 – ε3
⇒ –3ε = –3 ⇒ ε =1
⇒ Error lies in the value of y4, i.e., 18.
So corrected value is 18 – 1 = 17
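The error-location steps of Example 2.12 can be reproduced mechanically. A hedged sketch (helper name mine, not the book's):

```python
# An error ε in one entry shows up in the ∆³ column with the binomial
# coefficients 1, −3, 3, −1 of (1 − ε)³, as in Table 2.6.
def diff_column(y, order):
    cur = list(y)
    for _ in range(order):
        cur = [cur[k + 1] - cur[k] for k in range(len(cur) - 1)]
    return cur

y = [2, 5, 10, 18, 26, 37, 50]
d3 = diff_column(y, 3)
print(d3)                    # [1, -3, 3, -1]: the (1 − ε)³ pattern with ε = 1
eps = -d3[1] // 3            # middle coefficient: −3ε = −3, so ε = 1
y[3] -= eps                  # the faulty entry is the fourth value, 18
print(y)                     # corrected: [2, 5, 10, 17, 26, 37, 50]
print(diff_column(y, 3))     # now [0, 0, 0, 0]
```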
Example. 2.13: If f(x) = y is a polynomial of degree 3 and following gives the
values of x and y then locate and correct the error.
Values of x and y

x: 0 1 2 3 4 5 6
y: 4 10 30 75 160 294 490

Solution: The following difference table locates the error:
Table 2.7 Correction of Errors in the Table

x    y         ∆y     ∆2y    ∆3y
0    4
               6
1    10               14
               20             11 = 12 + ε
2    30               25
               45             15 = 12 − 3ε
3    75 + ε           40
               85             9 = 12 + 3ε
4    160              49
               134            13 = 12 − ε
5    294              62
               196
6    490

Since the polynomial is of degree 3, ∆3y should be constant. The third
differences 11, 15, 9, 13 average to (11 + 15 + 9 + 13)/4 = 48/4 = 12, and the
pattern of deviations ε, −3ε, 3ε, −ε gives ε = −1.
So, the entry corresponding to x = 3 is incorrect; it should be 75 − ε = 75 + 1 = 76.

2.4 NEWTON’S FORMULAE FOR INTERPOLATION

Newton’s formula is used for constructing the interpolation polynomial. It makes


use of divided differences. This result was first discovered by the Scottish
mathematician James Gregory (1638–1675), a contemporary of Newton.
Gregory and Newton did extensive work on methods of interpolation but
now the formula is referred to as Newton’s interpolation formula. Newton derived
a general forward and backward difference interpolation formulae.
2.4.1 Newton–Gregory Forward Interpolation Formula
Let y = f(x) be a function of x which assumes the values f(a), f(a + h), f(a + 2h),
..., f(a + nh) for (n + 1) equidistant values a, a + h, a + 2h, ..., a + nh of the
independent variable x. Let f(x) be a polynomial of nth degree.
Let f(x) = A0 + A1(x − a) + A2(x − a)(x − a − h)
+ A3(x – a)(x – a – h)(x – a – 2h) + ...
+ An(x − a) ... (x − a − (n − 1)h) (2.2)

where A0 , A1 , A2 , ..., An are to be determined.


Put x = a, a + h, a + 2h, ..., a + nh in Equation (2.2) successively.
For x = a, f(a) = A0 (2.3)

For x = a + h, f(a + h) = A0 + A1h
⇒ f(a + h) = f(a) + A1h | By (2.3)
⇒ A1 = ∆f(a)/h (2.4)
For x = a + 2h,
f(a + 2h) = A0 + A1(2h) + A2(2h)h
= f(a) + 2h [∆f(a)/h] + 2h2A2
⇒ 2h2A2 = f(a + 2h) − 2f(a + h) + f(a) = ∆2f(a)
⇒ A2 = ∆2f(a)/(2! h2)
Similarly, A3 = ∆3f(a)/(3! h3) and so on.
Thus, An = ∆nf(a)/(n! hn).
From Equation (2.2),
f(x) = f(a) + (x − a) ∆f(a)/h + (x − a)(x − a − h) ∆2f(a)/(2! h2) + ...
+ (x − a) ... (x − a − (n − 1)h) ∆nf(a)/(n! hn)
Put x = a + hk ⇒ k = (x − a)/h; you will have
f(a + hk) = f(a) + hk ∆f(a)/h + (hk)(hk − h) ∆2f(a)/(2! h2) + ...
+ (hk)(hk − h)(hk − 2h) ... (hk − (n − 1)h) ∆nf(a)/(n! hn)
⇒ f(a + hk) = f(a) + k ∆f(a) + [k(k − 1)/2!] ∆2f(a) + ...
+ [k(k − 1)(k − 2) ... (k − n + 1)/n!] ∆nf(a)
This is the required formula.

This formula is particularly useful for interpolating the values of f(x) near the
beginning of the given set of values. h is called the interval of differencing, while
∆ is the forward difference operator.
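The forward formula lends itself to a compact routine. The following is a hedged sketch (helper name mine, not the book's) that accumulates the terms k(k − 1)...(k − r + 1)/r! ∆r f(a) one factor at a time:

```python
# Sketch of Newton–Gregory forward interpolation on an equally spaced table.
def newton_forward(xs, ys, x):
    h = xs[1] - xs[0]
    k = (x - xs[0]) / h
    diffs, cur = [ys[0]], list(ys)            # leading differences ∆^r y_0
    for _ in range(len(ys) - 1):
        cur = [cur[j + 1] - cur[j] for j in range(len(cur) - 1)]
        diffs.append(cur[0])
    total, term = 0.0, 1.0
    for r, d in enumerate(diffs):
        total += term * d
        term *= (k - r) / (r + 1)             # builds k(k−1)...(k−r)/(r+1)!
    return total

# y = x² tabulated at x = 0..3 is reproduced exactly in between.
print(newton_forward([0, 1, 2, 3], [0, 1, 4, 9], 1.5))   # 2.25
```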
Example 2.14: From the following table, estimate the number of students who
weigh between 45 and 50 kg.

Weight (in kg): 35 – 45   45 – 55   55 – 65   65 – 75   75 – 85
No. of Students:   20        45        35        12        10

Solution: The cumulative frequency table is shown below:


Weight less than (x) (in kg): 45 55 65 75 85
No. of Students (yx ): 20 65 100 112 122
Thus, the difference table is shown below:
Difference Table for the Number of Students Who Weigh between 45–50 kg

x    yx     ∆y    ∆2y    ∆3y    ∆4y
45   20
            45
55   65           –10
            35            –13
65   100          –23             34
            12            21
75   112          –2
            10
85   122
By taking x0 = 45, you shall find the number of students with weight less
than 50.
So, k = (50 − 45)/10 = 0.5
Using Newton’s Forward Interpolation formula, you get
y50 = y45 + k ∆y45 + [k(k − 1)/2!] ∆2y45 + ...
∴ y50 = 20 + 0.5 × 45 + [0.5(0.5 − 1)/2!] × (−10)
+ [0.5(0.5 − 1)(0.5 − 2)/3!] × (−13)
+ [0.5(0.5 − 1)(0.5 − 2)(0.5 − 3)/4!] × (34)
= 20 + 22.5 + 1.25 – 0.8125 – 1.32813
= 41.60937 ≈ 42 students.

But the number of students with weight less than 45 kg is 20.
Hence, the number of students with weight between 45 kg and 50 kg is
42 – 20 = 22.
Example 2.15: Using Newton's forward interpolation formula, obtain a polynomial
in x which takes the following values.
x: 4 6 8 10
y: 1 3 8 16

Hence, evaluate y for x = 5.


Solution: Newton’s forward formula is:
y(x) = y0 + k∆y0 + [k(k − 1)/2!] ∆2y0 + [k(k − 1)(k − 2)/3!] ∆3y0 + ...
where k = (x − x0)/h.
The difference table for Newton’s Forward interpolation formula is as shown
in table given below:
x    y    ∆y    ∆2y    ∆3y
4    1
          2
6    3          3
          5           0
8    8          3
          8
10   16

Here, x0 = 4, k = (x − 4)/2
So, y(x) = 1 + [(x − 4)/2] × 2 + {[(x − 4)/2][(x − 4)/2 − 1]/2!} × 3 + 0
= 1 + (x − 4) + (3/8)(x − 4)(x − 6)
= 1 + (x − 4)[1 + (3/8)(x − 6)]
= 1 + (1/8)(x − 4)(3x − 10)
y(x) = (1/8)(3x2 − 22x + 48)
So, y(5) = (1/8)[3(5)2 − 22(5) + 48] = 13/8 = 1.625
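As a hedged arithmetic cross-check (not in the text), evaluating the forward formula directly at x = 5 with k = (5 − 4)/2 = 0.5 and the leading differences ∆y0 = 2, ∆2y0 = 3, ∆3y0 = 0 read from the table:

```python
# y(5) = y0 + k·∆y0 + [k(k−1)/2!]·∆²y0 with the table's leading differences.
k = 0.5
y5 = 1 + k * 2 + (k * (k - 1) / 2) * 3
print(y5)   # 1.625
```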
2
2.4.2 Newton–Gregory Backward Interpolation Formula
Let y = f(x) be a function of x which assumes the value f(a), f(a + h), f(a + 2h),
..., f(a + nh) for (n + 1) equidistant values a, a + h, a + 2h, ..., a + nh of the
independent variable x.
Let f(x) be a polynomial of the nth degree.
Let f(x) = A0 + A1(x − a − nh) + A2(x − a − nh)(x − a − (n − 1)h) + ...
+ An(x − a − nh)(x − a − (n − 1)h) ... (x − a − h) (2.5)
where A0, A1, A2, A3, ..., An are to be determined.

Put x = a + nh, a + (n − 1)h, ..., a in Equation (2.5) successively.
Put x = a + nh; then f(a + nh) = A0 (2.6)
Put x = a + (n – 1)h; then,
f(a + (n − 1)h) = A0 − hA1 = f(a + nh) − hA1 | By Equation (2.6)
⇒ A1 = ∇f(a + nh)/h (2.7)
Put x = a + (n – 2)h; then,
f(a + (n − 2)h) = A0 − 2hA1 + (−2h)(−h)A2
⇒ 2! h2A2 = f(a + (n − 2)h) – f(a + nh) + 2∇f(a + nh) = ∇2f(a + nh)
⇒ A2 = ∇2f(a + nh)/(2! h2) (2.8)
Proceeding in this way, you get,
An = ∇nf(a + nh)/(n! hn) (2.9)
Substituting these values in Equation (2.5), you get,
f(x) = f(a + nh) + (x − a − nh) ∇f(a + nh)/h + ...
+ (x − a − nh)(x − a − (n − 1)h) ... (x − a − h) ∇nf(a + nh)/(n! hn) (2.10)
Put x = a + nh + kh; then,
x – a – nh = kh
x – a – (n – 1)h = (k + 1)h
...
x – a – h = (k + n − 1)h
∴ Equation (2.10) becomes,
f(x) = f(a + nh) + k ∇f(a + nh) + [k(k + 1)/2!] ∇2f(a + nh)
+ ... + [k(k + 1) ... (k + n − 1)/n!] ∇nf(a + nh)
which is the required formula.
Or, f(xn + kh) = f(xn) + k ∇f(xn) + [k(k + 1)/2!] ∇2f(xn)
+ ... + [k(k + 1)(k + 2) ... (k + n − 1)/n!] ∇nf(xn)
where xn = x0 + nh and a = x0, so that f(a + nh) = f(xn).
This formula is useful when the value of f(x) is required near the end of the
table.
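A companion sketch of the backward formula (helper name mine, not the book's); it mirrors the forward routine but takes differences from the end of the table and uses the rising factors k(k + 1)...(k + r − 1):

```python
# Sketch of Newton–Gregory backward interpolation on an equally spaced table.
def newton_backward(xs, ys, x):
    h = xs[1] - xs[0]
    k = (x - xs[-1]) / h                      # k is negative inside the table
    diffs, cur = [ys[-1]], list(ys)           # trailing differences ∇^r y_n
    for _ in range(len(ys) - 1):
        cur = [cur[j + 1] - cur[j] for j in range(len(cur) - 1)]
        diffs.append(cur[-1])
    total, term = 0.0, 1.0
    for r, d in enumerate(diffs):
        total += term * d
        term *= (k + r) / (r + 1)             # builds k(k+1)...(k+r)/(r+1)!
    return total

print(newton_backward([0, 1, 2, 3], [0, 1, 4, 9], 2.5))  # 6.25, i.e. 2.5²
```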
Example 2.16: Using Newton’s backward interpolation formula, obtain the value
of tan 22°, given that:
θ° 0 4 8 12 16 20 24
tan θ : 0 0.0699 0.1405 0.2126 0.2867 0.3640 0.4452

Solution: The difference table is shown below:


Difference Table for Obtaining the Values of tan 22°

x    104 f(x) = y    ∆y     ∆2y    ∆3y    ∆4y    ∆5y
0    0
                    699
4    699                   7
                    706           8
8    1405                  15            –3
                    721           5              10
12   2126                  20            7
                    741           12             –12
16   2867                  32            –5
                    773           7
20   3640                  39
                    812
24   4452
Here, xn = 24, x = 22
and k = (x − xn)/h = – 0.5.
Thus, by Newton's backward interpolation formula, you have
y(x) = yn + k ∇yn + [k(k + 1)/2!] ∇2yn + [k(k + 1)(k + 2)/3!] ∇3yn + ...
y(22) = 4452 + (–0.5) × 812 + [(−0.5)(0.5)/2!] × 39 + [(−0.5)(0.5)(1.5)/3!] × 7
+ [(−0.5)(0.5)(1.5)(2.5)/4!] × (–5) + [(−0.5)(0.5)(1.5)(2.5)(3.5)/5!] × (–12)
= 4452 – 406 – 4.875 – 0.4375 + 0.1953 + 0.32813
104 y(22) = 4041.21
Thus, tan 22° = 0.4041.

2.5 CENTRAL DIFFERENCE INTERPOLATION FORMULAE

You shall now study the central difference formulae, most suited for interpolation
near the middle of a tabulated set.
2.5.1 Gauss’s Forward Difference Formula
Newton–Gregory forward difference formula is,
f(a + hk) = f(a) + k∆f(a) + [k(k − 1)/2!] ∆2f(a) + [k(k − 1)(k − 2)/3!] ∆3f(a)
+ [k(k − 1)(k − 2)(k − 3)/4!] ∆4f(a) + ... (2.11)
Putting a = 0, h = 1, you get
f(k) = f(0) + k∆f(0) + [k(k − 1)/2!] ∆2f(0) + [k(k − 1)(k − 2)/3!] ∆3f(0)
+ [k(k − 1)(k − 2)(k − 3)/4!] ∆4f(0) + ... (2.12)
Now,

∆3 f (−1) = ∆ 2 f (0) − ∆ 2 f (−1) ⇒ ∆ 2 f (0) = ∆3 f (−1) + ∆ 2 f (−1)


Also,

∆ 4 f (−1) = ∆3 f (0) − ∆3 f (−1) ⇒ ∆3 f (0) = ∆ 4 f (−1) + ∆3 f (−1)

NOTES and ∆5 f (−1) = ∆ 4 f (0) − ∆ 4 f (−1) ⇒ ∆ 4 f (0) = ∆5 f (−1) + ∆ 4 f (−1) and


so on.
∴ From Equation (2.12),
f(k) = f(0) + k∆f(0) + [k(k − 1)/2!]{∆2f(−1) + ∆3f(−1)}
+ [k(k − 1)(k − 2)/3!]{∆3f(−1) + ∆4f(−1)}
+ [k(k − 1)(k − 2)(k − 3)/4!]{∆4f(−1) + ∆5f(−1)} + ...
= f(0) + k∆f(0) + [k(k − 1)/2!] ∆2f(−1) + [k(k − 1)/2][1 + (k − 2)/3] ∆3f(−1)
+ [k(k − 1)(k − 2)/6][1 + (k − 3)/4] ∆4f(−1)
+ [k(k − 1)(k − 2)(k − 3)/4!] ∆5f(−1) + ...
= f(0) + k∆f(0) + [k(k − 1)/2!] ∆2f(−1) + [(k + 1)k(k − 1)/3!] ∆3f(−1)
+ [(k + 1)k(k − 1)(k − 2)/4!] ∆4f(−1)
+ [k(k − 1)(k − 2)(k − 3)/4!] ∆5f(−1) + ... (2.13)

But, ∆5f(−2) = ∆4f(−1) − ∆4f(−2)
∴ ∆4f(−1) = ∆4f(−2) + ∆5f(−2)
Then Equation (2.13) becomes,
f(k) = f(0) + k∆f(0) + [k(k − 1)/2!] ∆2f(−1) + [(k + 1)k(k − 1)/3!] ∆3f(−1)
+ [(k + 1)k(k − 1)(k − 2)/4!]{∆4f(−2) + ∆5f(−2)}
+ [k(k − 1)(k − 2)(k − 3)/4!] ∆5f(−1) + ...
i.e., f(k) = f(0) + k∆f(0) + [k(k − 1)/2!] ∆2f(−1) + [(k + 1)k(k − 1)/3!] ∆3f(−1)
+ [(k + 1)k(k − 1)(k − 2)/4!] ∆4f(−2) + ...
This is called Gauss's forward difference formula.


Note: This formula is applicable when k lies between 0 and 1/2.
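The formula can be sketched as a routine (helper name mine, not the book's): k is measured from the middle argument x0 of an odd-length table, and the rth term uses the difference ∆r y−⌊r/2⌋ from the central table.

```python
# Hedged sketch of Gauss's forward formula:
#   y = y0 + kΔy0 + k(k−1)/2! Δ²y_{-1} + (k+1)k(k−1)/3! Δ³y_{-1} + ...
def gauss_forward(xs, ys, x):
    m = (len(ys) - 1) // 2                    # middle entry plays x_0
    k = (x - xs[m]) / (xs[1] - xs[0])
    cols, cur = [list(ys)], list(ys)
    for _ in range(len(ys) - 1):
        cur = [cur[j + 1] - cur[j] for j in range(len(cur) - 1)]
        cols.append(cur)
    total, term = ys[m], 1.0
    for r in range(1, len(cols)):
        # successive numerator factors: k, (k−1), (k+1), (k−2), (k+2), ...
        factor = (k - r // 2) if r % 2 == 0 else (k + (r - 1) // 2)
        term *= factor / r
        total += term * cols[r][m - r // 2]   # uses Δ^r y_{−⌊r/2⌋}
    return total

# The quartic y = x⁴ on x = −2..2 is reproduced exactly at x = 0.5.
print(gauss_forward([-2, -1, 0, 1, 2], [16, 1, 0, 1, 16], 0.5))  # 0.0625
```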
Example 2.17: Find the value of f(41) using Gauss’ forward formula from the
following data:

x: 30 35 40 45 50
f ( x) : 3678.2 2995.1 2400.1 1876.2 1416.3

Solution: Gauss's forward formula is,
f(x) = y0 + k∆y0 + [k(k − 1)/2!] ∆2y−1 + [(k + 1)k(k − 1)/3!] ∆3y−1
+ [(k + 1)k(k − 1)(k − 2)/4!] ∆4y−2 + ...
The central difference table is shown below:
Central Difference Table

x    k    f(x)      ∆y       ∆2y     ∆3y    ∆4y
30   –2   3678.2
                   –683.1
35   –1   2995.1             88.1
                   –595              –17
40   0    2400.1             71.1            9.9
                   –523.9            –7.1
45   1    1876.2             64
                   –459.9
50   2    1416.3
Thus, the value of k = (41 − 40)/(45 − 40) = 1/5 = 0.2
Now, f(41) = 2400.1 + (0.2) × (–523.9) + [0.2(0.2 − 1)/2!] × (71.1)
+ [(0.2 + 1)(0.2)(0.2 − 1)/3!] × (−7.1)
+ [(0.2 + 1)(0.2)(0.2 − 1)(0.2 − 2)/4!] × (9.9)
= 2400.1 – 104.78 – 5.688 + 0.2272 + 0.14256
= 2290.00176.
2.5.2 Gauss’s Backward Difference Formula
Newton–Gregory forward difference formula is,
f(a + hk) = f(a) + k∆f(a) + [k(k − 1)/2!] ∆2f(a)
+ [k(k − 1)(k − 2)/3!] ∆3f(a) + ... (2.14)
Putting a = 0, h = 1, you get
f(k) = f(0) + k∆f(0) + [k(k − 1)/2!] ∆2f(0) + [k(k − 1)(k − 2)/3!] ∆3f(0)
+ [k(k − 1)(k − 2)(k − 3)/4!] ∆4f(0) + ... (2.15)
Now, ∆f (0) = ∆f (−1) + ∆ 2 f (−1)

∆ 2 f (0) = ∆ 2 f (−1) + ∆3 f (−1)

∆3 f (0) = ∆3 f (−1) + ∆ 4 f (−1)

∆ 4 f (0) = ∆ 4 f (−1) + ∆5 f (−1) and so on.


∴ From Equation (2.15),
f(k) = f(0) + k[∆f(−1) + ∆²f(−1)] + [k(k − 1)/2!][∆²f(−1) + ∆³f(−1)]
+ [k(k − 1)(k − 2)/3!][∆³f(−1) + ∆⁴f(−1)]
+ [k(k − 1)(k − 2)(k − 3)/4!][∆⁴f(−1) + ∆⁵f(−1)] + ...  (2.16)

= f(0) + k∆f(−1) + k[1 + (k − 1)/2]∆²f(−1)
+ [k(k − 1)/2][1 + (k − 2)/3]∆³f(−1)
+ [k(k − 1)(k − 2)/6][1 + (k − 3)/4]∆⁴f(−1)
+ [k(k − 1)(k − 2)(k − 3)/4!]∆⁵f(−1) + ...
= f(0) + k∆f(−1) + [(k + 1)k/2!]∆²f(−1) + [k(k + 1)(k − 1)/3!]∆³f(−1)
+ [(k + 1)k(k − 1)(k − 2)/4!]∆⁴f(−1) + ...  (2.17)

Again, ∆³f(−1) = ∆³f(−2) + ∆⁴f(−2)
and ∆⁴f(−1) = ∆⁴f(−2) + ∆⁵f(−2) and so on.
∴ Equation (2.17) gives
f(k) = f(0) + k∆f(−1) + [(k + 1)k/2!]∆²f(−1)
+ [(k + 1)k(k − 1)/3!]{∆³f(−2) + ∆⁴f(−2)}
+ [(k + 1)k(k − 1)(k − 2)/4!]{∆⁴f(−2) + ∆⁵f(−2)} + ...
Thus, Gauss' backward formula is,
y = y0 + k∆y−1 + [k(k + 1)/2!]∆²y−1 + [(k + 1)k(k − 1)/3!]∆³y−2
+ [(k + 2)(k + 1)k(k − 1)/4!]∆⁴y−2
+ [(k + 2)(k + 1)k(k − 1)(k − 2)/5!]∆⁵y−3 + ...

Example 2.18: Apply Gauss’s backward formula to compute sin 45° from the
following table:

θ° : 20 30 40 50 60 70 80
sin θ : 0.34202 0.502 0.64279 0.76604 0.86603 0.93969 0.98481

Solution: The difference table is shown below:
Difference Table to Compute sin 45°

x k y = sin θ ∆y ∆2y ∆3y ∆4y ∆5y ∆6y


20 –2 0.34202
0.15998
30 –1 0.502 –0.01919
0.14079 0.00165
40 0 0.64279 –0.01754 –0.00737
0.12325 –0.00572 0.01002
50 1 0.76604 –0.02326 0.00265 –0.01181
0.09999 –0.00307 –0.00179
60 2 0.86603 –0.02633 0.00086
0.07366 –0.00221
70 3 0.93969 –0.02854
0.04512
80 4 0.98481

Here, x0 = 40, x = 45
and k = (45 − 40)/10 = 0.5.
Thus, by Gauss' backward formula, you will have
y(x) = y0 + k∆y−1 + [k(k + 1)/2!]∆²y−1 + [(k + 1)k(k − 1)/3!]∆³y−2
+ [(k + 2)(k + 1)k(k − 1)/4!]∆⁴y−2 + [(k + 2)(k + 1)k(k − 1)(k − 2)/5!]∆⁵y−3 + ...
you have,
y(45) = 0.64279 + 0.5 × 0.14079 + [(1.5)(0.5)/2!] × (−0.01754)
+ [(1.5)(0.5)(−0.5)/3!] × (0.00165) + [(2.5)(1.5)(0.5)(−0.5)/4!] × (−0.00737)
= 0.64279 + 0.070395 − 0.0065775 − 0.000103125 + 0.00028789
= 0.70679.
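The same backward-formula arithmetic can be written as a Python sketch (variable names are illustrative); note that the unrounded fourth difference is −0.00737:

```python
theta = [20, 30, 40, 50, 60, 70, 80]
y = [0.34202, 0.502, 0.64279, 0.76604, 0.86603, 0.93969, 0.98481]

# Forward difference table: diffs[m][i] is the mth difference starting at row i.
diffs = [y[:]]
while len(diffs[-1]) > 1:
    prev = diffs[-1]
    diffs.append([prev[i + 1] - prev[i] for i in range(len(prev) - 1)])

o = 2                        # origin x0 = 40
k = (45 - theta[o]) / 10     # k = 0.5

# Gauss' backward formula, truncated after the fourth difference.
sin45 = (y[o]
         + k * diffs[1][o - 1]
         + k * (k + 1) / 2 * diffs[2][o - 1]
         + (k + 1) * k * (k - 1) / 6 * diffs[3][o - 2]
         + (k + 2) * (k + 1) * k * (k - 1) / 24 * diffs[4][o - 2])
print(round(sin45, 5))   # ≈ 0.70679, against sin 45° = 0.70711
```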
2.5.3 Stirling’s Formula
Gauss' forward and backward formulas are used to derive Stirling's formula.
Gauss’ forward formula is,
f(k) = f(0) + k∆f(0) + [k(k − 1)/2!]∆²f(−1) + [(k + 1)k(k − 1)/3!]∆³f(−1)
+ [(k + 1)k(k − 1)(k − 2)/4!]∆⁴f(−2) + ...  (2.18)
Gauss' backward formula is,
f(k) = f(0) + k∆f(−1) + [(k + 1)k/2!]∆²f(−1) + [(k + 1)k(k − 1)/3!]∆³f(−2)
+ [(k + 2)(k + 1)k(k − 1)/4!]∆⁴f(−2) + ...  (2.19)
Take the mean of Equations (2.18) and (2.19),
f(x) = f(0) + k[∆f(0) + ∆f(−1)]/2 + [k²/2!]∆²f(−1)
+ [(k + 1)k(k − 1)/3!][∆³f(−1) + ∆³f(−2)]/2
+ [k²(k² − 1)/4!]∆⁴f(−2) + ...  (2.20)

This is called Stirling's formula. It is useful when |k| < 1/2, that is, −1/2 < k < 1/2. It gives the best estimate when −1/4 < k < 1/4.
Example 2.19: Use Stirling’s formula to evaluate f(1.22) given,

x: 1.0 1.1 1.2 1.3 1.4


f ( x) : 0.841 0.891 0.932 0.963 0.985

Solution: The difference table to evaluate f(1.22) is given below:
x k y ∆y ∆²y ∆³y ∆⁴y
1.0 –2 0.841
0.050
1.1 –1 0.891 –0.009
0.041 –0.001
1.2 0 0.932 –0.010 0.002
0.031 0.001
1.3 1 0.963 –0.009
0.022
1.4 2 0.985

Taking x0 = 1.2 and h = 0.1, k = (x − x0)/h = (1.22 − 1.2)/0.1 = 0.2

Using Stirling's formula,
y(1.22) = 0.932 + (0.2) × [(0.031 + 0.041)/2] + [(0.2)²/2!] × (−0.01)
+ [(0.2){(0.2)² − 1²}/3!] × [(−0.001 + 0.001)/2] + [(0.2)²{(0.2)² − 1²}/4!] × (0.002)
= 0.932 + 0.0072 − 0.0002 + 0 − 0.0000032
= 0.9390 (approx.)
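Stirling's formula for this example can likewise be sketched in Python (names illustrative); the third-difference term vanishes because the two third differences average to zero:

```python
x = [1.0, 1.1, 1.2, 1.3, 1.4]
y = [0.841, 0.891, 0.932, 0.963, 0.985]

# Forward difference table.
diffs = [y[:]]
while len(diffs[-1]) > 1:
    prev = diffs[-1]
    diffs.append([prev[i + 1] - prev[i] for i in range(len(prev) - 1)])

o = 2                       # origin x0 = 1.2
k = (1.22 - x[o]) / 0.1     # h = 0.1, so k = 0.2

# Stirling's formula: means of the odd-order differences straddling the origin.
f122 = (y[o]
        + k * (diffs[1][o - 1] + diffs[1][o]) / 2
        + k ** 2 / 2 * diffs[2][o - 1]
        + k * (k ** 2 - 1) / 6 * (diffs[3][o - 2] + diffs[3][o - 1]) / 2
        + k ** 2 * (k ** 2 - 1) / 24 * diffs[4][o - 2])
print(round(f122, 4))   # ≈ 0.9390
```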
2.5.4 Bessel’s Interpolation Formula
Gauss' forward formula is,
f(k) = f(0) + k∆f(0) + [k(k − 1)/2!]∆²f(−1) + [(k + 1)k(k − 1)/3!]∆³f(−1)
+ [(k + 1)k(k − 1)(k − 2)/4!]∆⁴f(−2) + ...  (2.21)
Gauss's backward formula is,
f(k) = f(0) + k∆f(−1) + [k(k + 1)/2!]∆²f(−1) + [(k + 1)k(k − 1)/3!]∆³f(−2)
+ [(k + 2)(k + 1)k(k − 1)/4!]∆⁴f(−2) + ...  (2.22)
In Equation (2.22), shift the origin to 1 by replacing k by k − 1 and adding 1 to each argument 0, −1, −2, ...; you get
f(k) = f(1) + (k − 1)∆f(0) + [k(k − 1)/2!]∆²f(0)
+ [k(k − 1)(k − 2)/3!]∆³f(−1)
+ [(k + 1)k(k − 1)(k − 2)/4!]∆⁴f(−1) + ...  (2.23)
By taking the mean of Equations (2.21) and (2.23), you get
f(k) = [f(0) + f(1)]/2 + [{k + (k − 1)}/2]∆f(0)
+ [k(k − 1)/2!][∆²f(−1) + ∆²f(0)]/2
+ [k(k − 1)/3!][(k + 1 + k − 2)/2]∆³f(−1)
+ [(k + 1)k(k − 1)(k − 2)/4!][∆⁴f(−2) + ∆⁴f(−1)]/2 + ...
Finally, you get
f(k) = [f(0) + f(1)]/2 + (k − 1/2)∆f(0)
+ [k(k − 1)/2!][∆²f(−1) + ∆²f(0)]/2
+ [k(k − 1)(k − 1/2)/3!]∆³f(−1)
+ [(k + 1)k(k − 1)(k − 2)/4!][∆⁴f(−2) + ∆⁴f(−1)]/2 + ...  (2.24)

Example 2.20: Using Bessel’s formula obtain y26. Given that y20 = 2854, y24 =
3162, y28 = 3544 and y32 = 3992.
Solution: With x0 = 24 and k = (x − 24)/4, the central difference table is shown below:
Central Difference Table for Obtaining y26

x k y ∆y ∆2 y ∆3 y
20 –1 2854
308
24 0 3162 74
382 –8
28 1 3544 66
448
32 2 3992
Using Bessel's formula,
y26 = 3162 + 0.5 × 382 + [0.5(0.5 − 1)/2!] × [(74 + 66)/2] + 0 × (−8)
= 3162 + 191 + (−8.75)
= 3344.25.
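Bessel's formula for y26 can be sketched as follows (Python, illustrative names); the terms carrying a factor (k − 1/2) drop out because k = 0.5:

```python
xs = [20, 24, 28, 32]
ys = [2854, 3162, 3544, 3992]

# Forward difference table.
diffs = [ys[:]]
while len(diffs[-1]) > 1:
    prev = diffs[-1]
    diffs.append([prev[i + 1] - prev[i] for i in range(len(prev) - 1)])

o = 1                    # origin x0 = 24
k = (26 - xs[o]) / 4     # k = 0.5

# Bessel's formula up to the third difference.
y26 = ((ys[o] + ys[o + 1]) / 2
       + (k - 0.5) * diffs[1][o]
       + k * (k - 1) / 2 * (diffs[2][o - 1] + diffs[2][o]) / 2
       + k * (k - 1) * (k - 0.5) / 6 * diffs[3][o - 1])
print(y26)   # 3344.25
```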

2.5.5 Lagrange's Interpolation Formula
Let f(x0), f(x1), ..., f(xn) be (n + 1) entries of a function y = f(x), where f(x) is assumed to be a polynomial corresponding to the arguments x0, x1, x2, ..., xn.
The polynomial f(x) may be written as,
f(x) = A0(x − x1)(x − x2) ... (x − xn) + A1(x − x0)(x − x2) ... (x − xn)
+ ... + An(x − x0)(x − x1) ... (x − xn−1)  (2.25)
where A0, A1, ..., An are constants to be determined.

By putting x = x0, x1, ..., xn in Equation (2.25), you will get
f(x0) = A0(x0 − x1)(x0 − x2) ... (x0 − xn)
∴ A0 = f(x0) / [(x0 − x1)(x0 − x2) ... (x0 − xn)]  (2.26)
f(x1) = A1(x1 − x0)(x1 − x2) ... (x1 − xn)
∴ A1 = f(x1) / [(x1 − x0)(x1 − x2) ... (x1 − xn)]  (2.27)
Similarly, An = f(xn) / [(xn − x0)(xn − x1) ... (xn − xn−1)]  (2.28)

Substituting the values of A0, A1, ..., An in Equation (2.25), you will get
f(x) = [(x − x1)(x − x2) ... (x − xn)] / [(x0 − x1)(x0 − x2) ... (x0 − xn)] · f(x0)
+ [(x − x0)(x − x2) ... (x − xn)] / [(x1 − x0)(x1 − x2) ... (x1 − xn)] · f(x1)
+ ... + [(x − x0)(x − x1) ... (x − xn−1)] / [(xn − x0)(xn − x1) ... (xn − xn−1)] · f(xn)  (2.29)

This is called Lagrange's Interpolation Formula. In Equation (2.29), dividing both sides by (x − x0)(x − x1) ... (x − xn), Lagrange's formula may also be written as,
f(x) / [(x − x0)(x − x1) ... (x − xn)]
= f(x0) / [(x0 − x1)(x0 − x2) ... (x0 − xn)] · 1/(x − x0)
+ f(x1) / [(x1 − x0)(x1 − x2) ... (x1 − xn)] · 1/(x − x1) + ...
+ f(xn) / [(xn − x0)(xn − x1) ... (xn − xn−1)] · 1/(x − xn)  (2.30)

Another form of Lagrange's Formula
Prove that Lagrange's formula can be put in the form
Pn(x) = Σ_{r=0}^{n} φ(x) f(xr) / [(x − xr) φ′(xr)]
where φ(x) = Π_{r=0}^{n} (x − xr)
and φ′(xr) = [d/dx {φ(x)}] evaluated at x = xr.
Proof:
You have the Lagrange's formula,
Pn(x) = Σ_{r=0}^{n} [(x − x0)(x − x1) ... (x − xr−1)(x − xr+1) ... (x − xn)] / [(xr − x0)(xr − x1) ... (xr − xr−1)(xr − xr+1) ... (xr − xn)] · f(xr)
= Σ_{r=0}^{n} [φ(x)/(x − xr)] · f(xr) / [(xr − x0)(xr − x1) ... (xr − xr−1)(xr − xr+1) ... (xr − xn)]  (2.31)
Now,
φ(x) = Π_{r=0}^{n} (x − xr) = (x − x0)(x − x1) ... (x − xr−1)(x − xr)(x − xr+1) ... (x − xn)
∴ φ′(x) = (x − x1)(x − x2) ... (x − xn) + (x − x0)(x − x2) ... (x − xn) + ...
+ (x − x0)(x − x1) ... (x − xr−1)(x − xr+1) ... (x − xn) + ...
+ (x − x0)(x − x1) ... (x − xn−1)

⇒ φ′(xr) = [φ′(x)] evaluated at x = xr
= (xr − x0)(xr − x1) ... (xr − xr−1)(xr − xr+1) ... (xr − xn)  (2.32)
Hence from Equation (2.31),
Pn(x) = Σ_{r=0}^{n} φ(x) f(xr) / [(x − xr) φ′(xr)]  | Using Equation (2.32)

Hence proved.
Example 2.21: Find the unique polynomial P(x) of degree 2 such that,
P(1) = 1, P(3) = 27, P(4) = 64
Use the Lagrange’s method of Interpolation.
Solution: Here, x0 = 1, x1 = 3, x2 = 4
f(x0) = 1, f(x1) = 27, f(x2) = 64
Lagrange's Interpolation formula is,
P(x) = [(x − x1)(x − x2)] / [(x0 − x1)(x0 − x2)] · f(x0) + [(x − x0)(x − x2)] / [(x1 − x0)(x1 − x2)] · f(x1)
+ [(x − x0)(x − x1)] / [(x2 − x0)(x2 − x1)] · f(x2)
= [(x − 3)(x − 4)] / [(1 − 3)(1 − 4)] · (1) + [(x − 1)(x − 4)] / [(3 − 1)(3 − 4)] · (27) + [(x − 1)(x − 3)] / [(4 − 1)(4 − 3)] · (64)

= (1/6)(x² − 7x + 12) − (27/2)(x² − 5x + 4) + (64/3)(x² − 4x + 3)
= 8x² − 19x + 12
Hence, the required unique polynomial is,
P(x) = 8x² − 19x + 12
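Lagrange's formula needs no difference table, which makes it especially easy to program; the following is a minimal Python sketch (the function name lagrange is illustrative):

```python
def lagrange(xs, ys, x):
    """Evaluate the Lagrange interpolation polynomial through (xs, ys) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)   # the Lagrange basis factor
        total += term
    return total

# P(x) = 8x² − 19x + 12 from Example 2.21; for instance P(2) = 32 − 38 + 12 = 6.
print(lagrange([1, 3, 4], [1, 27, 64], 2))   # ≈ 6.0
```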

CHECK YOUR PROGRESS


4. What is Newton’s formula used for?
5. When is Gauss’s forward difference formula applicable?
6. How would you derive Stirling’s formula?

2.6 NEWTON’S DIVIDED DIFFERENCE
INTERPOLATION FORMULA

Let y0, y1, ..., yn be the values of y = f(x) corresponding to the arguments x0, x1, ..., xn. Then, from the definition of divided differences, you have
[x, x0] = (y − y0)/(x − x0)
so that, y = y0 + (x − x0)[x, x0]  (2.33)

Again, [x, x0, x1] = ([x, x0] − [x0, x1])/(x − x1)
which gives, [x, x0] = [x0, x1] + (x − x1)[x, x0, x1]  (2.34)
From Equations (2.33) and (2.34),
y = y0 + (x − x0)[x0, x1] + (x − x0)(x − x1)[x, x0, x1]  (2.35)
Also, [x, x0, x1, x2] = ([x, x0, x1] − [x0, x1, x2])/(x − x2)
which gives, [x, x0, x1] = [x0, x1, x2] + (x − x2)[x, x0, x1, x2]  (2.36)
From Equations (2.35) and (2.36),
y = y0 + (x − x0)[x0, x1] + (x − x0)(x − x1)[x0, x1, x2] + (x − x0)(x − x1)(x − x2)[x, x0, x1, x2]
Proceeding in this manner, you get
y = f(x) = y0 + (x − x0)[x0, x1] + (x − x0)(x − x1)[x0, x1, x2]
+ (x − x0)(x − x1)(x − x2)[x0, x1, x2, x3]
+ ... + (x − x0)(x − x1)(x − x2) ... (x − xn−1)[x0, x1, x2, x3, ..., xn]
+ (x − x0)(x − x1)(x − x2) ... (x − xn)[x, x0, x1, x2, ..., xn]
This is called Newton's General Interpolation formula with divided differences, the last term being the remainder term after (n + 1) terms.

Example 2.22: Referring to the following table, find the value of f(x) at point
x = 4:
x: 1.5 3 6
f(x): –0.25 2 20.
Solution: The divided difference table is shown below:
Divided Difference Table for Finding the Value of f(x)

x f(x) ∆| f(x) ∆| 2 f(x)

1.5 − 0.25

1.5

3 2 1
6
6 20

Applying Newton’s divided difference formula,


f(x) = –0.25 + (x – 1.5) (1.5) + (x – 1.5) (x – 3) (1)
Putting x = 4, you get
f(4) = 6.
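The divided difference table and the interpolation step can be combined in one Python sketch (names illustrative):

```python
def newton_divided(xs, ys, x):
    """Newton's divided difference interpolation evaluated at x."""
    n = len(xs)
    table = [ys[:]]          # table[m][i] is the mth divided difference from xs[i]
    for m in range(1, n):
        prev = table[-1]
        table.append([(prev[i + 1] - prev[i]) / (xs[i + m] - xs[i])
                      for i in range(n - m)])
    # Accumulate y0 + (x - x0)[x0,x1] + (x - x0)(x - x1)[x0,x1,x2] + ...
    result, prod = table[0][0], 1.0
    for m in range(1, n):
        prod *= x - xs[m - 1]
        result += table[m][0] * prod
    return result

print(newton_divided([1.5, 3, 6], [-0.25, 2, 20], 4))   # 6.0
```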
Example 2.23: Using Newton’s divided difference formula prove that,
f(x) = f(0) + x∆f(−1) + [(x + 1)x/2!]∆²f(−1) + [(x + 1)x(x − 1)/3!]∆³f(−2) + ...
Solution: Taking the arguments in the order 0, −1, 1, −2, ..., Newton's divided difference formula is,
f(x) = f(0) + x[0, −1] + x(x + 1)[0, −1, 1] + x(x + 1)(x − 1)[0, −1, 1, −2] + ...
Now, [0, −1] = [f(0) − f(−1)] / [0 − (−1)] = ∆f(−1)
[0, −1, 1] = ([0, 1] − [0, −1]) / [1 − (−1)] = (1/2)[∆f(0) − ∆f(−1)] = ∆²f(−1)/2!
[0, −1, 1, −2] = ([−1, 0, 1] − [−2, −1, 0]) / [1 − (−2)] = (1/3)[∆²f(−1)/2 − ∆²f(−2)/2] = ∆³f(−2)/3! and so on.
Substituting these values, you get,
f(x) = f(0) + x∆f(−1) + [(x + 1)x/2!]∆²f(−1) + [(x + 1)x(x − 1)/3!]∆³f(−2) + ...

2.7 DISCRETE LEAST SQUARES APPROXIMATION

Consider the problem of estimating the values of a function at non-tabulated points,


given the experimental data in Table 2.8.
Table 2.8 Experimental Data

xi yi xi yi
1 1.3 6 8.8
2 3.5 7 10.1
3 4.2 8 12.5
4 5.0 9 13.0
5 7.0 10 15.6

[Scatter plot of the data points (xi, yi), i = 1, 2, ..., 10.]
Interpolation requires a function that assumes the value yi at xi for each i = 1, 2, ..., 10. From this graph, it appears that the actual relationship between x and y is linear. However, it is likely that no line precisely fits the data, because of errors in the data.
The least squares approach to this problem involves determining the best
approximating line when the error involved is the sum of the squares of the differences
between the y-values on the approximating line and the given y-values. Hence,
constants a0 and a1 must be found that minimize the total least squares error
E = Σ_{i=1}^{10} |yi − (a1xi + a0)|² = Σ_{i=1}^{10} (yi − (a1xi + a0))²
where a1xi + a0 denotes the ith value on the approximating line, yi is the ith given y-value, and E is the error.
The least squares method is the most convenient procedure for determining
best linear approximations, and there are also important theoretical considerations
that favour this method. The least squares approach puts substantially more weight
on a point that is out of line with the rest of the data but will not allow that point to
dominate the approximation.
The general problem of fitting the best least squares line to a collection of data {(xi, yi)}, i = 1, ..., m, involves minimizing the total error
E = Σ_{i=1}^{m} (yi − (a1xi + a0))²
with respect to the parameters a0 and a1. For a minimum, take the partial derivatives with respect to a0 and a1 and set them equal to zero:
0 = ∂/∂a0 Σ_{i=1}^{m} (yi − (a1xi + a0))² = −2 Σ_{i=1}^{m} (yi − a1xi − a0)
and 0 = ∂/∂a1 Σ_{i=1}^{m} (yi − (a1xi + a0))² = −2 Σ_{i=1}^{m} (yi − a1xi − a0) xi
These equations simplify to the normal equations
a0 · m + a1 Σ_{i=1}^{m} xi = Σ_{i=1}^{m} yi and a0 Σ_{i=1}^{m} xi + a1 Σ_{i=1}^{m} xi² = Σ_{i=1}^{m} xiyi.

The solution to this system is as follows.


The linear least squares solution for a given collection of data {(xi, yi)}, i = 1, ..., m, has the form y = a1x + a0, where
a0 = [(Σxi²)(Σyi) − (Σxiyi)(Σxi)] / [m(Σxi²) − (Σxi)²]
and a1 = [m(Σxiyi) − (Σxi)(Σyi)] / [m(Σxi²) − (Σxi)²],
with all sums taken over i = 1, ..., m.
Consider the data presented in Table 2.8. To find the least squares line approximating this data, extend the table as shown in the third and fourth columns of Table 2.9 and sum the columns.
Table 2.9 Finding the Least Squares Line

xi yi xi² xiyi P(xi) = 1.538xi – 0.360


1 1.3 1 1.3 1.18
2 3.5 4 7.0 2.72
3 4.2 9 12.6 4.25
4 5.0 16 20.0 5.79
5 7.0 25 35.0 7.33
6 8.8 36 52.8 8.87
7 10.1 49 70.7 10.41
8 12.5 64 100.0 11.94
9 13.0 81 117.0 13.48
10 15.6 100 156.0 15.02

55 81.0 385 572.4 E = Σ_{i=1}^{10} (yi − P(xi))² ≈ 2.34

Solving the normal equations produces
a0 = [385(81) − 55(572.4)] / [10(385) − (55)²] = −0.360
and a1 = [10(572.4) − 55(81)] / [10(385) − (55)²] = 1.538.
So, P(x) = 1.538x – 0.360.
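The normal equations for the line reduce to a few sums, so the fit is easy to reproduce in Python (variable names illustrative):

```python
xs = list(range(1, 11))
ys = [1.3, 3.5, 4.2, 5.0, 7.0, 8.8, 10.1, 12.5, 13.0, 15.6]

m = len(xs)
sx, sy = sum(xs), sum(ys)                   # 55 and 81.0
sxx = sum(x * x for x in xs)                # 385
sxy = sum(x * y for x, y in zip(xs, ys))    # 572.4
denom = m * sxx - sx * sx                   # 825

a1 = (m * sxy - sx * sy) / denom            # slope ≈ 1.538
a0 = (sxx * sy - sxy * sx) / denom          # intercept ≈ −0.360
print(round(a1, 3), round(a0, 3))
```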
The problem of approximating a set of data, {(xi, yi) | i = 1, 2, ..., m}, with an algebraic polynomial
Pn(x) = anxⁿ + an−1xⁿ⁻¹ + ... + a1x + a0
of degree n < m − 1 using least squares is handled in a similar manner. It requires choosing the constants a0, a1, ..., an to minimize the total least squares error
E = Σ_{i=1}^{m} (yi − Pn(xi))².

For E to be minimized, it is necessary that ∂E/∂aj = 0 for each j = 0, 1, ..., n. This gives n + 1 normal equations in the n + 1 unknowns aj:
a0 Σxi⁰ + a1 Σxi¹ + a2 Σxi² + ... + an Σxiⁿ = Σ yixi⁰,
a0 Σxi¹ + a1 Σxi² + a2 Σxi³ + ... + an Σxiⁿ⁺¹ = Σ yixi¹,
⋮
a0 Σxiⁿ + a1 Σxiⁿ⁺¹ + a2 Σxiⁿ⁺² + ... + an Σxi²ⁿ = Σ yixiⁿ,
with all sums taken over i = 1, ..., m.

The normal equations will have a unique solution, provided that the xi are
distinct.
Continuous Least Squares Approximation
Suppose f ∈ C[a, b] and you want a polynomial of degree at most n,
Pn(x) = anxⁿ + an−1xⁿ⁻¹ + ... + a1x + a0 = Σ_{k=0}^{n} akxᵏ,
to minimize the error
E(a0, a1, ..., an) = ∫a^b (f(x) − Pn(x))² dx = ∫a^b (f(x) − Σ_{k=0}^{n} akxᵏ)² dx.
A necessary condition for the numbers a0, a1, ..., an to minimize the total error E is that
∂E/∂aj (a0, a1, ..., an) = 0 for each j = 0, 1, ..., n.

You can expand the integrand in this expression to
E = ∫a^b (f(x))² dx − 2 Σ_{k=0}^{n} ak ∫a^b xᵏ f(x) dx + ∫a^b (Σ_{k=0}^{n} akxᵏ)² dx,
so that
∂E/∂aj (a0, a1, ..., an) = −2 ∫a^b xʲ f(x) dx + 2 Σ_{k=0}^{n} ak ∫a^b x^(j+k) dx
for each j = 0, 1, ..., n. Setting these to zero and rearranging, you obtain the (n + 1) linear normal equations
Σ_{k=0}^{n} ak ∫a^b x^(j+k) dx = ∫a^b xʲ f(x) dx, for each j = 0, 1, ..., n,

which must be solved for the n + 1 unknowns a0, a1, ..., an. The normal equations have a unique solution provided that f ∈ C[a, b].
Example 2.24: Find the least squares approximating polynomial of degree 2 for
the function f(x) = sin πx on the interval [0, 1].

Solution: The normal equations for P2(x) = a2x² + a1x + a0 are
a0 ∫0^1 1 dx + a1 ∫0^1 x dx + a2 ∫0^1 x² dx = ∫0^1 sin πx dx,
a0 ∫0^1 x dx + a1 ∫0^1 x² dx + a2 ∫0^1 x³ dx = ∫0^1 x sin πx dx,
a0 ∫0^1 x² dx + a1 ∫0^1 x³ dx + a2 ∫0^1 x⁴ dx = ∫0^1 x² sin πx dx.
Performing the integration yields
a0 + (1/2)a1 + (1/3)a2 = 2/π,
(1/2)a0 + (1/3)a1 + (1/4)a2 = 1/π,
(1/3)a0 + (1/4)a1 + (1/5)a2 = (π² − 4)/π³.
These three equations in three unknowns can be solved to obtain
a0 = (12π² − 120)/π³ ≈ −0.050465 and a1 = −a2 = (720 − 60π²)/π³ ≈ 4.12251.

Consequently, the least squares polynomial approximation of degree 2 for f(x) = sin πx on [0, 1] is P2(x) = −4.12251x² + 4.12251x − 0.050465. (See Figure 2.1)
Figure 2.1 Least Square Polynomial Approximation of Degree 2

Figure 2.1 illustrates the difficulty in obtaining a least square polynomial


approximation. An (n + 1) × (n + 1) linear system for the unknowns a0, ..., an
must be solved, and the coefficients in the linear system are of the form

∫a^b x^(j+k) dx = [b^(j+k+1) − a^(j+k+1)] / (j + k + 1).

The matrix in the linear system is known as a Hilbert matrix, which is a


classic example for demonstrating round-off error difficulties.
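For the small system of Example 2.24 the Hilbert matrix is still harmless; a Python sketch (naive Gaussian elimination, names illustrative) solves the 3 × 3 normal equations directly:

```python
from math import pi

# Hilbert-type normal equations for P2 ≈ sin(pi x) on [0, 1]:
# entries are ∫ x^(j+k) dx = 1/(j + k + 1); right sides as in Example 2.24.
A = [[1 / (j + k + 1) for k in range(3)] for j in range(3)]
b = [2 / pi, 1 / pi, (pi ** 2 - 4) / pi ** 3]

# Forward elimination (no pivoting; acceptable for this well-behaved 3x3 case).
for col in range(3):
    for row in range(col + 1, 3):
        f = A[row][col] / A[col][col]
        A[row] = [ar - f * ac for ar, ac in zip(A[row], A[col])]
        b[row] -= f * b[col]

# Back substitution.
coef = [0.0, 0.0, 0.0]
for j in (2, 1, 0):
    coef[j] = (b[j] - sum(A[j][k] * coef[k] for k in range(j + 1, 3))) / A[j][j]

a0, a1, a2 = coef
print(round(a0, 6), round(a1, 5), round(a2, 5))   # ≈ −0.050465, 4.12251, −4.12251
```

For larger n the growing ill-conditioning of the Hilbert matrix would make this naive approach unreliable, which is exactly the difficulty the text describes.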
Trigonometric Polynomial Approximation
Trigonometric polynomials are used to approximate functions that have periodic behaviour, that is, functions with the property that for some constant T, f(x + T) = f(x) for all x. You can generally transform the problem so that T = 2π and restrict the approximation to the interval [−π, π].
For each positive integer n, the set Tn of trigonometric polynomials of degree less than or equal to n is the set of all linear combinations of {φ0, φ1, ..., φ2n−1}, where
φ0(x) = 1/2,
φk(x) = cos kx, for each k = 1, 2, ..., n,
and φn+k(x) = sin kx, for each k = 1, 2, ..., n − 1.
(Some sources include an additional function in the set, φ2 n ( x) = sin nx.)
The set {φ0 φ1 , ..., φ2n − 1} is orthogonal on [ −π, π] with respect to the
weight function w(x) = 1. This follows from a demonstration similar to that which
shows that the Chebyshev polynomials are orthogonal on [–1, 1]. For example, if

k ≠ j and j ≠ 0.
∫−π^π φn+k(x) φj(x) dx = ∫−π^π sin kx cos jx dx
The trigonometric identity
sin kx cos jx = (1/2)[sin(k + j)x + sin(k − j)x]
can now be used to give
∫−π^π φn+k(x) φj(x) dx = (1/2) ∫−π^π [sin(k + j)x + sin(k − j)x] dx
= (1/2)[−cos(k + j)x/(k + j) − cos(k − j)x/(k − j)] evaluated from −π to π = 0,
since cos(k + j)π = cos(k + j) (–π) and cos(k – j)π = cos (k – j) (–π). The result
also holds when k = j, for in this case you have sin(k – j)x = sin 0 = 0.
Showing orthogonality for the other possibilities from {φ0 , φ1 , ..., φ2n − 1}
is similar and uses the appropriate trigonometric identities from the collection
sin jx cos kx = (1/2)[sin(j + k)x + sin(j − k)x]
sin jx sin kx = (1/2)[cos(j − k)x − cos(j + k)x]
cos jx cos kx = (1/2)[cos(j + k)x + cos(j − k)x]
to convert the products into sums.
Given f ∈ C[–π, π], the continuous least squares approximation by functions
in Tn is defined by
Sn(x) = (1/2)a0 + Σ_{k=1}^{2n−1} ak φk(x),
where ak = (1/π) ∫−π^π f(x) φk(x) dx for each k = 0, 1, ..., 2n − 1.

The limit of Sn(x) as n → ∞ is called the Fourier series of f(x). Fourier


series are used to describe the solution of various ordinary and partial-differential
equations that occur in physical situations.
To determine the trigonometric polynomial from Tn that approximates
f(x) = | x | for –π<x<π
requires finding
a0 = (1/π) ∫−π^π |x| dx = −(1/π) ∫−π^0 x dx + (1/π) ∫0^π x dx = (2/π) ∫0^π x dx = π,
ak = (1/π) ∫−π^π |x| cos kx dx = (2/π) ∫0^π x cos kx dx = [2/(πk²)][(−1)ᵏ − 1], for each k = 1, 2, ..., n,
and the coefficients an + k . The coefficients an + k in the Fourier expansion are
commonly denoted bk, that is, bk = an + k for k = 1, 2, ..., n – 1. In the example,
you have
bk = (1/π) ∫−π^π |x| sin kx dx = 0, for each k = 1, 2, ..., n − 1,

since the integrand is an odd function. The trigonometric polynomial from Tn


approximating f is, therefore,

Sn(x) = π/2 + (2/π) Σ_{k=1}^{n} [((−1)ᵏ − 1)/k²] cos kx.

The first few trigonometric polynomials for f(x) = | x | are shown in


Figure 2.2.

[The figure shows y = |x| together with S0(x) = π/2, S1(x) = S2(x) = π/2 − (4/π) cos x, and S3(x) = π/2 − (4/π) cos x − (4/(9π)) cos 3x.]
Figure 2.2 First few trigonometric polynomials for f(x) = | x |

The Fourier series for f(x) = | x | is
S(x) = lim_{n→∞} Sn(x) = π/2 + (2/π) Σ_{k=1}^{∞} [((−1)ᵏ − 1)/k²] cos kx.

Since | cos kx | ≤ 1 for every k and x, the series converges and S(x) exists
for all real numbers x.
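The convergence can be observed numerically; the following Python sketch (the function name S is illustrative) evaluates the partial sums of this series:

```python
from math import cos, pi

def S(n, x):
    """nth partial sum of the Fourier series of f(x) = |x| on [-pi, pi]."""
    return pi / 2 + (2 / pi) * sum(((-1) ** k - 1) / k ** 2 * cos(k * x)
                                   for k in range(1, n + 1))

for n in (1, 3, 25):
    print(n, round(S(n, 1.0), 4))   # approaches |1.0| = 1 as n grows
```

Only the odd-k terms contribute, since (−1)ᵏ − 1 vanishes for even k.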
2.7.1 Fourier Series
There is a discrete analog to Fourier series that is useful for the least squares
approximation and interpolation of large amounts of data when the data are given

at equally spaced points. Suppose that a collection of 2m paired data points {(xj, yj)}, j = 0, 1, ..., 2m − 1, is given, with the first elements in the pairs equally partitioning a closed interval. For convenience, you may assume that the interval is [−π, π] and that
xj = −π + (j/m)π for each j = 0, 1, ..., 2m − 1.
If this were not the case, a linear transformation could be used to change
the data into this form.

Figure 2.3 Linear Transformation to Change Data

The goal is to determine the trigonometric polynomial, Sn(x), in Tn that


minimizes
E(Sn) = Σ_{j=0}^{2m−1} (yj − Sn(xj))².
That is, you want to choose the constants a0, a1, ..., an and b1, b2, ..., bn−1 to minimize the total error
E(Sn) = Σ_{j=0}^{2m−1} [yj − (a0/2 + an cos nxj + Σ_{k=1}^{n−1} (ak cos kxj + bk sin kxj))]².
The determination of the constants is simplified by the fact that the set is orthogonal with respect to summation over the equally spaced points {xj}, j = 0, 1, ..., 2m − 1, in [−π, π]. By this it is meant that, for each k ≠ l,
Σ_{j=0}^{2m−1} φk(xj) φl(xj) = 0.

The orthogonality follows from the fact that if r and m are positive integers
with r < 2m, you have
Σ_{j=0}^{2m−1} cos rxj = 0 and Σ_{j=0}^{2m−1} sin rxj = 0.

To obtain the constants ak for k = 0, 1, ..., n and bk for k = 1, 2, ..., n – 1


in the summation
Sn(x) = a0/2 + an cos nx + Σ_{k=1}^{n−1} (ak cos kx + bk sin kx),

you minimize the least squares sum


E(a0, ..., an, b1, ..., bn−1) = Σ_{j=0}^{2m−1} (yj − Sn(xj))²
by setting to zero the partial derivatives of E with respect to the ak's and the bk's. This implies that
ak = (1/m) Σ_{j=0}^{2m−1} yj cos kxj, for each k = 0, 1, ..., n,
and bk = (1/m) Σ_{j=0}^{2m−1} yj sin kxj, for each k = 1, 2, ..., n − 1.
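Both the discrete orthogonality of the nodes and the coefficient formulas are easy to verify numerically; in the Python sketch below (names illustrative), m = 8 and f(x) = |x|, for which the discrete a0 again comes out as π and every bk vanishes by symmetry:

```python
from math import cos, sin, pi

m = 8
nodes = [-pi + j * pi / m for j in range(2 * m)]   # 2m equally spaced points

# Discrete orthogonality: for a positive integer r < 2m the node sums vanish.
r = 3
cos_sum = sum(cos(r * x) for x in nodes)
sin_sum = sum(sin(r * x) for x in nodes)

def trig_coeffs(f, n):
    """Discrete least squares coefficients a_k, b_k for y_j = f(x_j)."""
    a = [sum(f(x) * cos(k * x) for x in nodes) / m for k in range(n + 1)]
    b = [sum(f(x) * sin(k * x) for x in nodes) / m for k in range(1, n)]
    return a, b

a, b = trig_coeffs(abs, 3)
print(round(cos_sum, 10), round(sin_sum, 10), round(a[0], 6))   # 0, 0 and ≈ π
```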

Let f(x) = x⁴ − 3x³ + 2x² − tan x(x − 2). To find the discrete least squares approximation S3 for the data {(xj, yj)}, j = 0, 1, ..., 9, where xj = j/5 and yj = f(xj), first requires a transformation from [0, 2] to [−π, π]. The required linear transformation is
zj = π(xj − 1),
and the transformed data are of the form
{(zj, f(1 + zj/π))}, j = 0, 1, ..., 9.
Consequently, the least squares trigonometric polynomial is
S3(z) = a0/2 + a3 cos 3z + Σ_{k=1}^{2} (ak cos kz + bk sin kz),
where ak = (1/5) Σ_{j=0}^{9} f(1 + zj/π) cos kzj, for k = 0, 1, 2, 3,
and bk = (1/5) Σ_{j=0}^{9} f(1 + zj/π) sin kzj, for k = 1, 2.

Evaluating these sums produces the approximation
S3(z) = 0.76201 + 0.77177 cos z + 0.017423 cos 2z
+ 0.0065673 cos 3z − 0.38676 sin z + 0.047806 sin 2z.
Converting back to the variable x gives
S3(x) = 0.76201 + 0.77177 cos π(x – 1)
+ 0.017423 cos 2π(x – 1) + 0.0065673 cos 3π(x – 1)
– 0.38676 sin π(x – 1) + 0.047806 sin 2π(x – 1).

Table 2.10 lists values of f(x) and S3(x).

Table 2.10 Values of f(x) and S3(x)

x f(x) S3 (x) |f(x) – S3(x)|


0.125 0.26440 0.24060 2.38 × 10–2
0.375 0.84081 0.85154 1.07 × 10–2
0.625 1.36150 1.36248 9.74 × 10–4
0.875 1.61282 1.60406 8.75 × 10–3
1.125 1.36672 1.37566 8.94 × 10–3
1.375 0.71697 0.71545 1.52 × 10–3
1.625 0.07909 0.06929 9.80 × 10–3
1.875 –0.14576 –0.12302 2.27 × 10–2

Example 2.25: Find the value of log 323.5 from the following data:
x: 321 322.8 324.2 325
log10 x: 2.50651 2.50893 2.51081 2.51188

Solution: Since the arguments are unequally spaced, you should make use of Newton's


divided difference interpolation formula. The divided difference table is shown
below:
Divided Difference Table for Finding the Values of log 323.5

x y = log10x ∆| y ∆|² y ∆|³ y

321 2.50651
0.00134444
322.8 2.50893 –0.00000049
0.00134286 –0.00000049
324.2 2.51081 –0.00000244
0.00133750
325 2.51188

Here, x0 = 322.8, y0 = 2.50893, x1 = 324.2, x2 = 325 and x = 323.5.
The Newton’s divided difference formula is,
y = y0 + (x – x0). ∆| y0 + (x – x0) (x – x1). ∆|2 y0 + ...

Thus,
log 323.5 = 2.50893 + (323.5 – 322.8) × (0.00134286) + (323.5 – 322.8)
× (323.5 – 324.2) × (–0.00000244) + (323.5 – 322.8) ×
(323.5 – 324.2) × (323.5 – 325) × (–0.00000049)
= 2.50987
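Because the tabulated entries are just common logarithms rounded to five decimals, the whole example can be cross-checked in Python using exact values of log10 (variable names illustrative):

```python
from math import log10

xs = [321, 322.8, 324.2, 325]
ys = [log10(v) for v in xs]    # 2.50651, 2.50893, 2.51081, 2.51188 to 5 d.p.

# Divided difference table, as in Section 2.6.
n = len(xs)
table = [ys[:]]
for m in range(1, n):
    prev = table[-1]
    table.append([(prev[i + 1] - prev[i]) / (xs[i + m] - xs[i])
                  for i in range(n - m)])

# Newton's divided difference formula evaluated at x = 323.5.
value, prod = table[0][0], 1.0
for m in range(1, n):
    prod *= 323.5 - xs[m - 1]
    value += table[m][0] * prod
print(round(value, 5))   # 2.50987
```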

CHECK YOUR PROGRESS


7. List the merits of Lagrange’s formula?
8. What does the least square approach involve?
9. What are trigonometric functions used for?
10. What are Fourier series used to describe?

2.8 SUMMARY
In this unit, you have learned that:
• The process of estimating or approximating the value of independent variable
y for a given value of x in the specified range say xi ≤ x ≤ xn is known as
Interpolation. If the approximating function φ(x) is a polynomial, the process
is known as Polynomial Interpolation.
• If you want to estimate the value of y for any given value of x which is
outside the range xi ≤ x ≤ xn, then the process of approximating y = f(x)
by another function φ(x) is known as Extrapolation.
• Interpolation is very important in numerical analysis as it provides the base
for numerical differentiation and integration.
• The study of interpolation is based on the calculus of finite differences.
• Finite differences deal with the variations in a function corresponding to the
changes in the independent variables.
• y = f(x) is a function where x is the independent and y is the dependent
variable.
• Finite differences are the differences between the values of the function or
differences between the past differences. There are four types of differences
which are forward difference, backward difference, central difference and
divided difference.
• Central difference is denoted by δ and is defined as:
δy1/2 = y1 – y0, δy3/2 = y2 – y1, ..., δyn – 1/2 = yn – yn – 1 and so on.

• Higher order differences are obtained as:
δ²y1 = δy3/2 − δy1/2; δ²y2 = δy5/2 − δy3/2;
δ³y3/2 = δ²y2 − δ²y1 and so on.
• The nth difference of a polynomial of nth degree is constant and all higher
order differences are zero when the values of the independent variable are
at equal interval.
• A product of the form x(x – 1) (x – 2) ... (x – r + 1) is denoted by [x]r and
is called factorial.
• Newton’s formula is used for constructing the interpolation polynomial. It
makes use of divided differences.
• The ‘least squares method’ is the most convenient procedure for determining best linear approximations.
• The least squares approach puts substantially more weight on a point that is
out of line with the rest of the data but will not allow that point to dominate
the approximation.
• Trigonometric functions are used to approximate functions that have periodic
behaviour, functions with the property that for some constant T, f(x + T) =
f(x) for all x.
• There is a discrete analog to Fourier series that is useful for the least squares
approximation and interpolation of large amounts of data when the data are
given at equally spaced points.

2.9 KEY TERMS

• Interpolation: It is the process of estimating or approximating the value of


independent variable y for a given value of x in the specified range say xi ≤
x ≤ xn .
• Extrapolation: It is the process of approximating y = f(x) by another function
φ(x) used to estimate the value of y for any given value of x which is outside
the range xi ≤ x ≤ xn.
• First forward difference: They are the differences y1 – y0, y2 – y1 ..., yn
– yn – 1.
• Entries: They are the higher order differences that can be expressed in
terms of the values of y.

2.10 ANSWERS TO ‘CHECK YOUR PROGRESS’

1. If you want to calculate the value of independent variable x for a given value
of dependent variable y, the process is known as inverse interpolation.
2. The differences y1 – y0, y2 – y1, y3 – y2, ..., yn – yn – 1 when denoted as
∇y1, ∇y2,..., ∇yn, respectively, are called the first backward differences.
3. Shift operator (E) is defined by the equation
Eyr = yr + 1 or Ef(x) = f(x + h)
which shows that the effect of E is to shift the functional value yr to the next
higher value yr + 1.
4. Newton’s formula is used for constructing the interpolation polynomial. It
makes use of divided differences.
5. Gauss' forward difference formula is applicable when k lies between 0 and 1/2.
6. Gauss’s forward and backward formulas are used to derive Stirling’s
formula.
7. The merits of Lagrange’s formula are as follows:
• The formula is simple and easy to remember.
• There is no need to construct the divided difference table and you
can directly interpolate the unknown value with the help of given
observations.
8. The least square approach involves determining the best approximating line
when the error involved is the sum of the squares of the differences between
the y-values on the approximating line and the given y-values.
9. Trigonometric functions are used to approximate functions that have periodic
behaviour, functions with the property that for some constant
T, f(x+T) = f(x) for all x.
10. Fourier series are used to describe the solution of various ordinary and
partial-differential equations that occur in physical situations.

2.11 QUESTIONS AND EXERCISES


Short-Answer Questions
1. What are first forward differences?
2. What is forward difference operator?
3. What is backward difference operator?
4. What is meant by feature notation [x]r?
5. What is central difference operator?
6. What are divided difference?

Long-Answer Questions

1. The values of a polynomial of degree 5 are tabulated below:
x: 0 1 2 3 4 5 6
f(x): 1 2 33 254 1025 3126 7777
If f(3) is known to be in error, find its correct value.
2. If y = f(x) is a polynomial of degree 3 and the following table gives the
values of x and y, locate and correct the wrong values of y.
x: 0 1 2 3 4 5 6
y: 4 10 30 75 160 294 490.
4. Prove that:
x² + (1 + x)²/2 + (2 + x)²/2² + (3 + x)²/2³ + ... = 2(x² + 2x + 3)
Use the calculus of finite differences and take the interval of differencing as unity.
[Hint: (1 + x)² = Ex², (2 + x)² = E²x², (3 + x)² = E³x², ...]
5. Given the following values:
x:    5     7     11     13     17
f(x): 150   392   1452   2366   5202
evaluate f(9) using Newton’s divided difference formula.
6. Find the continuous least square trigonometric polynomial S3(x) for
f(x) = ex on [–π, π].
7. Prove the following identities:
(i) ux − ∆²ux + ∆³ux − ∆⁵ux + ∆⁶ux − ∆⁸ux + ...
= ux − ∆²ux−1 + ∆⁴ux−2 − ∆⁶ux−3 + ∆⁸ux−4 − ...
(ii) ∑ (x = 0 to ∞) u2x = (1/2) ∑ (x = 0 to ∞) ux + (1/4)[1 − ∆/2 + ∆²/4 − ...]u0
8. If f(E) is a polynomial in E such that,
f(E) = a0Eⁿ + a1Eⁿ⁻¹ + a2Eⁿ⁻² + ... + an,
prove that f(E)·eˣ = eˣ·f(e), taking the interval of differencing as unity.
We now proceed to study the use of finite difference calculus for the purpose
of interpolation. This we shall do in three cases as follows:
(i) The values of the argument in the given data vary by an equal interval. The technique is called interpolation with equal intervals.
(ii) The values of the argument are not at equal intervals. This is known as interpolation with unequal intervals.
(iii) The technique of central differences.
9. The following table gives the distance in nautical miles of the visible horizon
for the given heights in feet above the earth’s surface:
x: 100    150    200    250    300    350    400
y: 10.63  13.03  15.04  16.81  18.42  19.90  21.27
Use Newton’s forward formula to find y when x = 218 ft.
10. Use Newton’s divided difference formula to find f(7) if f(3) = 24, f(5) = 120,
f(8) = 504, f(9) = 720, and f(12) = 1716.
11. Find the least squares polynomials of degrees 1, 2, and 3 for the data in the
following table. Compute the error E in each case.
xi 1.0 1.1 1.3 1.5 1.9 2.1
yi 1.84 1.96 2.21 2.45 2.94 3.18

2.12 FURTHER READING

Mott, J.L. Discrete Mathematics for Computer Scientists, 2nd Edition. New
Delhi: Prentice-Hall of India Pvt. Ltd., 2007.
Bonini, Charles P., Warren H. Hausman, and Harold Bierman. Quantitative
Analysis for Business Decisions. Illinois: Richard D. Irwin, 1986.
Charnes, A., W.W. Cooper, and A. Henderson. An Introduction to Linear
Programming. New York: John Wiley & Sons, 1953.
Hogg, Robert. V., Allen T. Craig and Joseph W. McKean. Introduction to
Mathematical Statistics. New Delhi: Pearson Education, 2005.

UNIT 3 NUMERICAL INTEGRATION AND DIFFERENTIAL INTERPOLATION
Structure
3.0 Introduction
3.1 Unit Objectives
3.2 Numerical Differentiation
3.3 Differentiating a Tabulated Function
3.4 Differentiating a Graphical Function
3.5 Numerical Integration Interpolation Formulae
3.5.1 Newton–Cotes Quadrature Formulae
3.5.2 Trapezoidal Rule (n = 1)
3.5.3 Simpson’s (1/3) Rule (n = 2)
3.5.4 Boole’s Rule (n = 4)
3.5.5 Weddle’s Rule (n = 6)
3.5.6 Gauss Quadrature Formulae
3.5.7 Errors in Quadrature Formulae
3.6 Approximation Theory
3.7 Summary
3.8 Key Terms
3.9 Answers to ‘Check Your Progress’
3.10 Questions and Exercises
3.11 Further Reading

3.0 INTRODUCTION

In science and engineering, many problems involve differentiation of various types
of functions. In this unit, you will learn that any function which is expressed in
mathematical form can be differentiated by analytical methods. Let y = f(x) be
expressed in mathematical form; then its first order derivative is defined as,
dy/dx or f′(x) = lim (δx → 0) [f(x + δx) − f(x)]/δx
where δx is a small increase in the value of x. However, if the values of the
independent variable x and the corresponding dependent variable y are given only in
tabular or graphic form, then it is not possible to find the derivatives by
analytical methods. Numerical techniques are used instead to approximate the
values of dy/dx and d²y/dx² from the tabular data. The set of values (xi, yi) of a
function y = f(x) is taken to compute the result. The process of computing or
approximating the derivatives dⁿy/dxⁿ (n ≥ 1) of f(x) at some values of x from the
given data is known as numerical differentiation.
You will also learn that derivatives can be computed by first replacing the
function y = f(x) by the best interpolating polynomial y = φ(x) and then differentiating
it as many times as you desire. The selection of interpolation formula will
depend on the value of x at which the derivative is desired.
If derivatives are required near the beginning of the table and the values of x are
equispaced, you make use of Newton’s forward formula. Newton’s backward formula
will do the needful if derivatives are required near the end of the equispaced table.
If the derivatives are required in the middle of the equispaced table, you can
make use of Stirling’s or Bessel’s formula. Newton’s divided difference or
Lagrange’s interpolation formula is to be used if the values of x are not equispaced.

3.1 UNIT OBJECTIVES

After going through this unit, you will be able to:


• Explain numerical differentiation
• Describe a tabulated function
• Understand numerical integration interpolation formulae
• Comprehend the approximation theory

3.2 NUMERICAL DIFFERENTIATION

Sir Isaac Newton proposed interpolation formulae for forward and backward
interpolation. These are used for numerical differentiation. Such tools are widely
used in the fields of engineering, statistics and other branches of mathematics.
Computer science also uses these concepts to find nearly accurate solutions for
differentiation.
Forward interpolations
Sir Isaac Newton proposed the formula for forward interpolation that bears his
name. It is expressed as a finite difference identity in which an interpolated
value between tabulated points is built from the first value y0 and powers of the
forward difference. The forward difference is denoted by ∆, known as the forward
difference operator, and is defined as the value obtained by subtracting the present
value from the next value. If the initial value is y0 and the next value is y1,
then ∆y0 = y1 − y0. In a similar way ∆² is used: ∆²y0 = ∆y1 − ∆y0. Proceeding this
way, you may write the first, second and nth forward differences as follows:
∆y0 = y1 − y0,   ∆²y0 = ∆y1 − ∆y0,   ...,   ∆ⁿy0 = ∆ⁿ⁻¹y1 − ∆ⁿ⁻¹y0
Taking this difference you may denote the next term(s), and thus,
y1 = y0 + ∆y0 ⇒ y1 = (1 + ∆)y0

Here, 1 + ∆ represents a forward shift, and a separate operator E, known as the
forward shift operator, is used: E = 1 + ∆. In the light of this fact, you may
write y1 = Ey0 and y2 = Ey1 = E(Ey0) = E²y0. Proceeding this way, you may write
yn = Eⁿy0.
Backward interpolations
Just as there is a forward difference operator for forward interpolation, there is
a backward difference operator for backward interpolation. This is also credited
to Newton. In forward differences you look to the next term, but in backward
differences you look to the preceding term, i.e., the one earlier to it. Backward
differences are denoted by the backward difference operator ∇ and are given as:
∇yn = yn − yn−1 ⇒ yn−1 = yn − ∇yn = (1 − ∇)yn
Just as the reference value in forward differences is y0, in backward differences
it is yn.
Thus, ∇y1 = y1 − y0, ∇y2 = y2 − y1 ⇒ ∇²y2 = ∇y2 − ∇y1, and proceeding this way
you get, ∇ⁿyn = ∇ⁿ⁻¹yn − ∇ⁿ⁻¹yn−1
Relation between difference operators
E = 1 + ∆ ⇒ ∆ = E − 1
∇(Ey0) = ∇(y1) = y1 − y0
Thus, (1 − ∇)Ey0 = Ey0 − ∇(Ey0) = y1 − ∇(y1) = y1 − (y1 − y0) = y0
or, (1 − ∇)(1 + ∆)y0 = y0, which is true for all the terms of y, i.e., y0, y1, y2, ..., yn.
Thus, (1 − ∇)(1 + ∆) = 1 and (1 − ∇)⁻¹ = (1 + ∆) = E,
and also, ∆ = (1 − ∇)⁻¹ − 1.
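As a quick numerical illustration (our own sketch, not part of the original text), the operators ∆, ∇ and E can be checked on a short table of values:

```python
# Numerical check of the difference operators: Delta (forward), nabla
# (backward) and the shift operator E = 1 + Delta.

def forward_diff(y):
    # Delta y_i = y_{i+1} - y_i
    return [b - a for a, b in zip(y, y[1:])]

y = [1, 2, 33, 254, 1025]

d1 = forward_diff(y)    # first forward differences
d2 = forward_diff(d1)   # second forward differences: Delta^2 y_i

# E = 1 + Delta shifts every value to the next one: (1 + Delta) y_i = y_{i+1}
E_y = [yi + di for yi, di in zip(y, d1)]

# The same list read against the next index gives the backward differences:
# nabla y_{i+1} = y_{i+1} - y_i = Delta y_i.

print(E_y)     # [2, 33, 254, 1025]
print(d2[0])   # Delta^2 y0 = y2 - 2*y1 + y0 = 30
```

The second print confirms that ∆²y0 = ∆y1 − ∆y0 agrees with the direct expansion y2 − 2y1 + y0.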

Central difference operator


The forward shift operator, applied to a term, yields the next term. Let the value
of y corresponding to a value of x be denoted f(x) instead of y; with a small
increment h, when the value of x becomes x + h, the next value is denoted f(x + h).
Using the forward shift operator, the same can be written as Ef(x) = f(x + h).
You can also view this as Eyn−1 = yn: if f(x) is one term, then f(x + h) is the
next term.
The central difference operator is defined as δf(x) = f(x + h/2) − f(x − h/2).
This is known as the first central difference. Higher-order central differences
can also be given as:
δ²f(x) = f(x + h) − 2f(x) + f(x − h)
and δⁿf(x) = δⁿ⁻¹f(x + h/2) − δⁿ⁻¹f(x − h/2)
In the following paragraphs, Newton’s formulae for forward and backward
interpolation, and Stirling’s and Bessel’s central difference formulae, are explained.

(1) Newton’s forward difference interpolation formula
y = y0 + k∆y0 + [k(k − 1)/2!]∆²y0 + [k(k − 1)(k − 2)/3!]∆³y0 + ...   ...(3.1)
where, k = (x − a)/h   ...(3.2)
Differentiating Equation (3.1) with respect to k, you get,
dy/dk = ∆y0 + [(2k − 1)/2]∆²y0 + [(3k² − 6k + 2)/6]∆³y0 + ...   ...(3.3)
Differentiating Equation (3.2) with respect to x, you get,
dk/dx = 1/h   ...(3.4)
You know that,
dy/dx = (dy/dk)·(dk/dx) = (1/h)[∆y0 + ((2k − 1)/2)∆²y0 + ((3k² − 6k + 2)/6)∆³y0 + ...]   ...(3.5)
Equation (3.5) provides the value of dy/dx at any x which is not tabulated.
Equation (3.5) becomes simple for tabulated values of x, in particular when x = a
and k = 0.
Putting k = 0 in Equation (3.5), you get,
[dy/dx]x = a = (1/h)[∆y0 − (1/2)∆²y0 + (1/3)∆³y0 − (1/4)∆⁴y0 + (1/5)∆⁵y0 − ...]   ...(3.6)
Differentiating Equation (3.5) with respect to x, you get,
d²y/dx² = (d/dx)(dy/dx) = (d/dk)(dy/dx)·(dk/dx)
= (1/h)[∆²y0 + (k − 1)∆³y0 + ((6k² − 18k + 11)/12)∆⁴y0 + ...]·(1/h)
= (1/h²)[∆²y0 + (k − 1)∆³y0 + ((6k² − 18k + 11)/12)∆⁴y0 + ...]   ...(3.7)
Putting k = 0 in Equation (3.7), you get,
[d²y/dx²]x = a = (1/h²)[∆²y0 − ∆³y0 + (11/12)∆⁴y0 − ...]   ...(3.8)
Similarly, you get,
[d³y/dx³]x = a = (1/h³)[∆³y0 − (3/2)∆⁴y0 + ...]   ...(3.9)
and so on.
Aliter: You know that,
E = e^(hD) ⇒ 1 + ∆ = e^(hD)
∴ hD = log(1 + ∆) = ∆ − ∆²/2 + ∆³/3 − ∆⁴/4 + ...
⇒ D = (1/h)[∆ − (1/2)∆² + (1/3)∆³ − (1/4)∆⁴ + ...]
Similarly,
D² = (1/h²)[∆ − (1/2)∆² + (1/3)∆³ − (1/4)∆⁴ + ...]²
= (1/h²)[∆² − ∆³ + (11/12)∆⁴ − (5/6)∆⁵ + ...]
and D³ = (1/h³)[∆³ − (3/2)∆⁴ + ...]
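The series (3.6) and (3.8) translate directly into code. The sketch below (our own illustration) applies them to a table of f(x) = x³ sampled with step h = 0.1; for a cubic the fourth and higher forward differences vanish, so the truncated series reproduce the exact values f′(1) = 3 and f″(1) = 6:

```python
# Derivatives at the START of an equispaced table from Newton's forward
# difference formula, Equations (3.6) and (3.8).

def forward_diff_table(y):
    # Column j holds the j-th forward differences of the tabulated values.
    cols = [list(y)]
    while len(cols[-1]) > 1:
        prev = cols[-1]
        cols.append([b - a for a, b in zip(prev, prev[1:])])
    return cols

h = 0.1
ys = [(1.0 + i * h) ** 3 for i in range(6)]        # f(x) = x^3 at x = 1.0 ... 1.5
d = [c[0] for c in forward_diff_table(ys)[1:]]     # Delta y0, Delta^2 y0, ...

# Equation (3.6): dy/dx at x = x0
fp = (d[0] - d[1] / 2 + d[2] / 3 - d[3] / 4 + d[4] / 5) / h
# Equation (3.8): d2y/dx2 at x = x0
fpp = (d[1] - d[2] + 11 * d[3] / 12 - 5 * d[4] / 6) / h ** 2

print(round(fp, 6), round(fpp, 6))   # close to 3.0 and 6.0
```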
(2) Newton’s backward difference interpolation formula
y = yn + k∇yn + [k(k + 1)/2!]∇²yn + [k(k + 1)(k + 2)/3!]∇³yn + ...   ...(3.10)
where, k = (x − xn)/h   ...(3.11)
Differentiating Equation (3.10) with respect to k, you get,
dy/dk = ∇yn + [(2k + 1)/2]∇²yn + [(3k² + 6k + 2)/6]∇³yn + ...   ...(3.12)
Differentiating Equation (3.11) with respect to x, you get,
dk/dx = 1/h   ...(3.13)
Now, dy/dx = (dy/dk)·(dk/dx)
= (1/h)[∇yn + ((2k + 1)/2)∇²yn + ((3k² + 6k + 2)/6)∇³yn + ...]   ...(3.14)
Equation (3.14) provides the value of dy/dx at any x which is not tabulated.
At x = xn, you have k = 0.
∴ Putting k = 0 in Equation (3.14), you get,
[dy/dx]x = xn = (1/h)[∇yn + (1/2)∇²yn + (1/3)∇³yn + (1/4)∇⁴yn + ...]   ...(3.15)
Differentiating Equation (3.14) with respect to x, you get,
d²y/dx² = (d/dk)(dy/dx)·(dk/dx)
= (1/h²)[∇²yn + (k + 1)∇³yn + ((6k² + 18k + 11)/12)∇⁴yn + ...]   ...(3.16)
Putting k = 0 in Equation (3.16), you get,
[d²y/dx²]x = xn = (1/h²)[∇²yn + ∇³yn + (11/12)∇⁴yn + ...]   ...(3.17)
Similarly, you get,
[d³y/dx³]x = xn = (1/h³)[∇³yn + (3/2)∇⁴yn + ...]   ...(3.18)
and so on.
Formulae for computing higher derivatives may be obtained by successive
differentiation.
Aliter: You know that,
E⁻¹ = 1 − ∇ ⇒ e^(−hD) = 1 − ∇
∴ −hD = log(1 − ∇) = −[∇ + (1/2)∇² + (1/3)∇³ + (1/4)∇⁴ + ...]
⇒ D = (1/h)[∇ + (1/2)∇² + (1/3)∇³ + (1/4)∇⁴ + ...]
Also, D² = (1/h²)[∇ + (1/2)∇² + (1/3)∇³ + ...]²
= (1/h²)[∇² + ∇³ + (11/12)∇⁴ + ...]
Similarly, D³ = (1/h³)[∇³ + (3/2)∇⁴ + ...] and so on.
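The backward-difference series (3.15) and (3.17) can be checked the same way at the end of a table. A minimal sketch (our own illustration), again with f(x) = x³ and h = 0.1, where the exact values at the last tabular point x = 1.5 are f′ = 6.75 and f″ = 9:

```python
# Derivatives at the END of an equispaced table from Newton's backward
# difference formula, Equations (3.15) and (3.17).

def backward_diffs_at_end(y):
    # nabla^j y_n for j = 1, 2, ...: the last entry of each difference column.
    diffs, col = [], list(y)
    while len(col) > 1:
        col = [b - a for a, b in zip(col, col[1:])]
        diffs.append(col[-1])
    return diffs

h = 0.1
ys = [(1.0 + i * h) ** 3 for i in range(6)]   # x = 1.0 ... 1.5
nd = backward_diffs_at_end(ys)                # nabla y_n, nabla^2 y_n, ...

# Equation (3.15): dy/dx at x = x_n
fp = (nd[0] + nd[1] / 2 + nd[2] / 3 + nd[3] / 4) / h
# Equation (3.17): d2y/dx2 at x = x_n
fpp = (nd[1] + nd[2] + 11 * nd[3] / 12) / h ** 2

print(round(fp, 6), round(fpp, 6))   # close to 6.75 and 9.0
```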
(3) Stirling’s central difference interpolation formula
y = y0 + (k/1!)[(∆y0 + ∆y−1)/2] + (k²/2!)∆²y−1 + [k(k² − 1²)/3!][(∆³y−1 + ∆³y−2)/2]
+ [k²(k² − 1²)/4!]∆⁴y−2 + [k(k² − 1²)(k² − 2²)/5!][(∆⁵y−2 + ∆⁵y−3)/2] + ...   ...(3.19)
where, k = (x − a)/h   ...(3.20)
Differentiating Equation (3.19) with respect to k, you get,
dy/dk = (∆y0 + ∆y−1)/2 + k∆²y−1 + [(3k² − 1)/6][(∆³y−1 + ∆³y−2)/2]
+ [(4k³ − 2k)/4!]∆⁴y−2 + [(5k⁴ − 15k² + 4)/5!][(∆⁵y−2 + ∆⁵y−3)/2] + ...   ...(3.21)
Differentiating Equation (3.20) with respect to x, you get,
dk/dx = 1/h   ...(3.22)
Now,
dy/dx = (dy/dk)·(dk/dx)
= (1/h)[(∆y0 + ∆y−1)/2 + k∆²y−1 + ((3k² − 1)/6)((∆³y−1 + ∆³y−2)/2)
+ ((4k³ − 2k)/4!)∆⁴y−2 + ((5k⁴ − 15k² + 4)/5!)((∆⁵y−2 + ∆⁵y−3)/2) + ...]   ...(3.23)
Equation (3.23) provides the value of dy/dx at any x which is not tabulated.
At x = a, you have k = 0.
∴ Putting k = 0 in Equation (3.23), you get,
[dy/dx]x = a = (1/h)[(∆y0 + ∆y−1)/2 − (1/6)((∆³y−1 + ∆³y−2)/2) + (1/30)((∆⁵y−2 + ∆⁵y−3)/2) − ...]   ...(3.24)
Differentiating Equation (3.23) with respect to x, you get,
d²y/dx² = (d/dk)(dy/dx)·(dk/dx)
= (1/h²)[∆²y−1 + k((∆³y−1 + ∆³y−2)/2) + ((6k² − 1)/12)∆⁴y−2
+ ((2k³ − 3k)/12)((∆⁵y−2 + ∆⁵y−3)/2) + ...]   ...(3.25)
Putting k = 0 in Equation (3.25), you get,
[d²y/dx²]x = a = (1/h²)[∆²y−1 − (1/12)∆⁴y−2 + (1/90)∆⁶y−3 − ...]   ...(3.26)
and so on.
Formulae for computing higher derivatives may be obtained by successive
differentiation.
(4) Bessel’s central difference interpolation formula
y = (y0 + y1)/2 + (k − 1/2)∆y0 + [k(k − 1)/2!][(∆²y−1 + ∆²y0)/2]
+ [k(k − 1)(k − 1/2)/3!]∆³y−1 + [(k + 1)k(k − 1)(k − 2)/4!][(∆⁴y−2 + ∆⁴y−1)/2]
+ [(k + 1)k(k − 1)(k − 2)(k − 1/2)/5!]∆⁵y−2
+ [(k + 2)(k + 1)k(k − 1)(k − 2)(k − 3)/6!][(∆⁶y−3 + ∆⁶y−2)/2] + ...   ...(3.27)
where, k = (x − a)/h   ...(3.28)
Differentiating Equation (3.27) with respect to k, you get,
dy/dk = ∆y0 + [(2k − 1)/2!][(∆²y−1 + ∆²y0)/2] + [(3k² − 3k + 1/2)/3!]∆³y−1
+ [(4k³ − 6k² − 2k + 2)/4!][(∆⁴y−2 + ∆⁴y−1)/2] + [(5k⁴ − 10k³ + 5k − 1)/5!]∆⁵y−2
+ [(6k⁵ − 15k⁴ − 20k³ + 45k² + 8k − 12)/6!][(∆⁶y−3 + ∆⁶y−2)/2] + ...   ...(3.29)
Differentiating Equation (3.28) with respect to x, you get,
dk/dx = 1/h
Now, dy/dx = (dy/dk)·(dk/dx)
= (1/h)[∆y0 + ((2k − 1)/2!)((∆²y−1 + ∆²y0)/2) + ((3k² − 3k + 1/2)/3!)∆³y−1
+ ((4k³ − 6k² − 2k + 2)/4!)((∆⁴y−2 + ∆⁴y−1)/2) + ((5k⁴ − 10k³ + 5k − 1)/5!)∆⁵y−2
+ ((6k⁵ − 15k⁴ − 20k³ + 45k² + 8k − 12)/6!)((∆⁶y−3 + ∆⁶y−2)/2) + ...]   ...(3.30)
Equation (3.30) provides the value of dy/dx at any x which is not tabulated.
At x = a, you have k = 0.
∴ Putting k = 0 in Equation (3.30), you get,
[dy/dx]x = a = (1/h)[∆y0 − (1/2)((∆²y−1 + ∆²y0)/2) + (1/12)∆³y−1
+ (1/12)((∆⁴y−2 + ∆⁴y−1)/2) − (1/120)∆⁵y−2 − (1/60)((∆⁶y−3 + ∆⁶y−2)/2) + ...]   ...(3.31)
Differentiating Equation (3.30) with respect to x, you get,
d²y/dx² = (d/dk)(dy/dx)·(dk/dx)
= (1/h²)[(∆²y−1 + ∆²y0)/2 + ((2k − 1)/2)∆³y−1 + ((6k² − 6k − 1)/12)((∆⁴y−2 + ∆⁴y−1)/2)
+ ((4k³ − 6k² + 1)/24)∆⁵y−2 + ((15k⁴ − 30k³ − 30k² + 45k + 4)/360)((∆⁶y−3 + ∆⁶y−2)/2) + ...]   ...(3.32)
Putting k = 0 in Equation (3.32), you get,
[d²y/dx²]x = a = (1/h²)[(∆²y−1 + ∆²y0)/2 − (1/2)∆³y−1 − (1/12)((∆⁴y−2 + ∆⁴y−1)/2)
+ (1/24)∆⁵y−2 + (1/90)((∆⁶y−3 + ∆⁶y−2)/2) + ...]   ...(3.33)
and so on.
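Stirling's derivative formula (3.24) can be checked at an interior tabular point. The sketch below (our own illustration) applies its first two terms to f(x) = x³ with h = 0.1; for a cubic the fifth differences vanish, so the result matches the exact f′(1.2) = 4.32:

```python
# Stirling's formula (3.24) for dy/dx at an interior tabular point x0 = 1.2.

def diff_columns(y):
    cols = [list(y)]
    while len(cols[-1]) > 1:
        prev = cols[-1]
        cols.append([b - a for a, b in zip(prev, prev[1:])])
    return cols

h = 0.1
ys = [(1.0 + i * h) ** 3 for i in range(5)]   # x = 1.0 ... 1.4; middle point is 1.2
cols = diff_columns(ys)

# With x0 the middle entry (index 2): Delta y_{-1} = cols[1][1],
# Delta y_0 = cols[1][2]; Delta^3 y_{-2} = cols[3][0], Delta^3 y_{-1} = cols[3][1].
mu1 = (cols[1][1] + cols[1][2]) / 2
mu3 = (cols[3][0] + cols[3][1]) / 2

fp = (mu1 - mu3 / 6) / h
print(round(fp, 6))   # close to 4.32
```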

CHECK YOUR PROGRESS


1. Who proposed interpolation formulae? Why are they used?
2. How is forward interpolation expressed?
3. How is backward difference denoted?
4. Differentiate between forward difference and backward difference.
5. Define central difference operator.
6. Write Newton's forward difference interpolation formula.
7. Write Newton's backward difference interpolation formula.
8. How are formulae for computing higher derivatives obtained?
9. Write Stirling's central difference interpolation formula.

3.3 DIFFERENTIATING A TABULATED FUNCTION

You know that Newton’s forward difference interpolation formula is,
y = y0 + k∆y0 + [k(k − 1)/2!]∆²y0 + [k(k − 1)(k − 2)/3!]∆³y0 + ...   ...(3.34)
Differentiating with respect to k, you have,
dy/dk = ∆y0 + [(2k − 1)/2]∆²y0 + [(3k² − 6k + 2)/6]∆³y0 + ...   ...(3.35)
For maxima or minima, dy/dk = 0. Hence, by equating the right-hand side of Equation
(3.35) to zero and retaining terms up to the third difference, you have,
∆y0 + [(2k − 1)/2]∆²y0 + [(3k² − 6k + 2)/6]∆³y0 = 0
or, [(1/2)∆³y0]k² + (∆²y0 − ∆³y0)k + [∆y0 − (1/2)∆²y0 + (1/3)∆³y0] = 0   ...(3.36)
which is a quadratic equation in k. Thus, substitute the values of ∆y0, ∆²y0 and
∆³y0 from the difference table and solve Equation (3.36) for k. The corresponding
values of x are given by x = x0 + kh, at which y is maximum or minimum.
Example 3.1: Find f′(1.0), f″(1.0), f′(2.0) and f″(2.0) from the following table:
x:    1.0   1.2      1.4      1.6      1.8      2.0
f(x): 0.0   0.1280   0.5540   1.2960   2.4320   4.0000
Solution: The difference table is given below (forward differences are read
downwards from the top of the table, backward differences upwards from the bottom):

Difference Table
x      f(x) = y    ∆y        ∆²y      ∆³y      ∆⁴y      ∆⁵y
1.0    0.0
                   0.1280
1.2    0.1280                0.298
                   0.4260             0.018
1.4    0.5540                0.316             0.06
                   0.7420             0.078             −0.1
1.6    1.2960                0.394             −0.04
                   1.1360             0.038
1.8    2.4320                0.432
                   1.5680
2.0    4.0000

Since you have to find f′(x) and f″(x) at the start of the table, you will
differentiate Newton’s forward difference formula at x = x0.
You know that,
[dy/dx]x = x0 = (1/h)[∆y0 − (1/2)∆²y0 + (1/3)∆³y0 − (1/4)∆⁴y0 + (1/5)∆⁵y0 − ...]
and, [d²y/dx²]x = x0 = (1/h²)[∆²y0 − ∆³y0 + (11/12)∆⁴y0 − (5/6)∆⁵y0 + (137/180)∆⁶y0 − ...]
Thus,
f′(1.0) = (1/h)[0.1280 − (1/2) × 0.298 + (1/3) × 0.018 − (1/4) × 0.06 + (1/5) × (−0.1)]
Given that h = 0.2,
f′(1.0) = (1/0.2) × (−0.0500) = −0.2500
Similarly,
f″(1.0) = [1/(0.2)²][0.298 − 0.018 + (11/12) × 0.06 − (5/6) × (−0.1)] = 10.4583
Now to find f′(2.0) and f″(2.0) you will differentiate Newton’s backward
interpolation formula at x = xn. You get,
[dy/dx]x = xn = (1/h)[∇yn + (1/2)∇²yn + (1/3)∇³yn + (1/4)∇⁴yn + ...]
and, [d²y/dx²]x = xn = (1/h²)[∇²yn + ∇³yn + (11/12)∇⁴yn + (5/6)∇⁵yn + (137/180)∇⁶yn + ...]
Thus,
f′(2.0) = (1/0.2)[1.5680 + (1/2) × 0.432 + (1/3) × 0.038 + (1/4) × (−0.04) + (1/5) × (−0.1)]
= 8.8333
and f″(2.0) = [1/(0.2)²][0.432 + 0.038 + (11/12) × (−0.04) + (5/6) × (−0.1)]
= 8.7500
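The hand computation above can be cross-checked in a few lines of code (our own sketch):

```python
# Cross-check of Example 3.1: derivatives from the leading forward and
# trailing backward differences of the tabulated values.

def diff_columns(y):
    cols = [list(y)]
    while len(cols[-1]) > 1:
        prev = cols[-1]
        cols.append([b - a for a, b in zip(prev, prev[1:])])
    return cols

h = 0.2
ys = [0.0, 0.1280, 0.5540, 1.2960, 2.4320, 4.0000]
cols = diff_columns(ys)
fd = [c[0] for c in cols[1:]]    # Delta^j y0 (top of each column)
bd = [c[-1] for c in cols[1:]]   # nabla^j y_n (bottom of each column)

fp0 = (fd[0] - fd[1] / 2 + fd[2] / 3 - fd[3] / 4 + fd[4] / 5) / h
fpp0 = (fd[1] - fd[2] + 11 * fd[3] / 12 - 5 * fd[4] / 6) / h ** 2
fpn = (bd[0] + bd[1] / 2 + bd[2] / 3 + bd[3] / 4 + bd[4] / 5) / h
fppn = (bd[1] + bd[2] + 11 * bd[3] / 12 + 5 * bd[4] / 6) / h ** 2

print(round(fp0, 4), round(fpp0, 4))   # -0.25 and 10.4583
print(round(fpn, 4), round(fppn, 4))   # 8.8333 and 8.75
```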

Example 3.2: Obtain the value of f′(0.04) using Bessel’s formula from the following table:
x:    0.01     0.02     0.03     0.04     0.05     0.06
f(x): 0.1023   0.1047   0.1071   0.1096   0.1122   0.1148
Solution: Using Bessel’s formula at x0 = 0.04 with h = 0.01 and k = (x − x0)/h = (x − 0.04)/0.01,
[f′(x)]x = 0.04 = (1/h)[∆y0 − (1/4)(∆²y−1 + ∆²y0) + (1/12)∆³y−1
+ (1/24)(∆⁴y−2 + ∆⁴y−1) − (1/120)∆⁵y−2 − (1/120)(∆⁶y−3 + ∆⁶y−2) + ...]   ...(1)
The difference table is as shown below:
Difference Table
x      k    f(x) = y    ∆y       ∆²y       ∆³y       ∆⁴y       ∆⁵y
0.01   −3   0.1023
                        0.0024
0.02   −2   0.1047               0.0000
                        0.0024             0.0001
0.03   −1   0.1071               0.0001              −0.0001
                        0.0025             0.0000               0.0000
0.04    0   0.1096               0.0001              −0.0001
                        0.0026             −0.0001
0.05    1   0.1122               0.0000
                        0.0026
0.06    2   0.1148
By using the above tabulated values in Equation (1), you have,
[f′(x)]0.04 = (1/0.01)[0.0026 − (1/4) × (0.0001 + 0.0000) + (1/12) × (−0.0001)
+ (1/24) × (−0.0001 + 0)]
= (1/0.01)[0.0026 − 0.000025 − 0.000008333 − 0.000004167]
= (1/0.01)[0.0025625] = 0.25625

Example 3.3: Given are the following pairs of values of x and y:
x: 1   2   4   8    10
y: 0   1   5   21   27
Determine (dy/dx) at x = 4 numerically.
Solution: Since the values are not equispaced you will have to use Newton’s
divided difference formula. The divided difference table is shown below:
Difference Table for Newton’s Divided Difference Formula
x     y     ∆y     ∆²y      ∆³y       ∆⁴y
1     0
            1
2     1            1/3
            2               0
4     5            1/3                −1/144
            4               −1/16
8     21           −1/6
            3
10    27

Newton’s divided difference formula is,
f(x) = y0 + (x − x0)·∆y0 + (x − x0)(x − x1)·∆²y0 + (x − x0)(x − x1)(x − x2)·∆³y0
+ (x − x0)(x − x1)(x − x2)(x − x3)·∆⁴y0 + ...   ...(1)
the differences here being divided differences. By substituting the corresponding
values from the above table in Equation (1), you have,
f(x) = 0 + (x − 1) × 1 + (x − 1)(x − 2) × (1/3) + (x − 1)(x − 2)(x − 4) × 0
+ (x − 1)(x − 2)(x − 4)(x − 8) × (−1/144)
= (x − 1) + (x² − 3x + 2)/3 − (x⁴ − 15x³ + 70x² − 120x + 64)/144
Now differentiating f(x) with respect to x, you have,
f′(x) = 1 + (2x − 3)/3 − (4x³ − 45x² + 140x − 120)/144
Now f′(x) at x = 4,
[f′(x)]₄ = 1 + (8 − 3)/3 − (256 − 720 + 560 − 120)/144
= 1 + 5/3 + 24/144 = 2.8333
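As a cross-check (our own sketch), the divided-difference coefficients and the derivative of the resulting interpolating polynomial at x = 4 can be computed numerically; the small step `e` used for the central-difference estimate is an arbitrary value chosen for the check:

```python
# Newton's divided differences for the data of Example 3.3, and a numerical
# derivative of the interpolating polynomial at x = 4.

def divided_diff_coeffs(xs, ys):
    # Returns [f[x0], f[x0,x1], f[x0,x1,x2], ...] computed in place.
    coef = list(ys)
    n = len(xs)
    for j in range(1, n):
        for i in range(n - 1, j - 1, -1):
            coef[i] = (coef[i] - coef[i - 1]) / (xs[i] - xs[i - j])
    return coef

def newton_eval(xs, coef, x):
    # Horner-like evaluation of the Newton form of the polynomial.
    result = coef[-1]
    for i in range(len(coef) - 2, -1, -1):
        result = result * (x - xs[i]) + coef[i]
    return result

xs = [1, 2, 4, 8, 10]
ys = [0, 1, 5, 21, 27]
coef = divided_diff_coeffs(xs, ys)
print(coef)   # leading coefficients 0, 1, 1/3, 0 and -1/144 (as in the table)

# Central-difference estimate of dP/dx at x = 4:
e = 1e-6
dp = (newton_eval(xs, coef, 4 + e) - newton_eval(xs, coef, 4 - e)) / (2 * e)
print(dp)     # about 17/6 = 2.8333...
```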
Example 3.4: From the following table find the minimum and maximum values of the function.
x:    0   1      2   3      4      5
f(x): 0   0.25   0   2.25   16.0   56.25
Solution: The forward difference table is given below (as the values of x are equispaced).
Forward Difference Table
x    f(x)      ∆f(x)    ∆²f(x)    ∆³f(x)    ∆⁴f(x)    ∆⁵f(x)
0    0
               0.25
1    0.25               −0.5
               −0.25               3.0
2    0                  2.5                  6
               2.25                9.0                  0
3    2.25               11.5                 6
               13.75               15.0
4    16.0               26.5
               40.25
5    56.25
Newton’s forward difference formula is
f(x) = f(x0) + k∆f(x0) + [k(k − 1)/2!]∆²f(x0) + ...   ...(1)
Differentiating with respect to k, you have,
df(x)/dk = ∆f(x0) + [(2k − 1)/2]∆²f(x0) + [(3k² − 6k + 2)/6]∆³f(x0) + ...   ...(2)
For maxima and minima, df(x)/dk = 0.
Hence, equating the right-hand side of Equation (2) to zero and retaining only
up to third differences, you obtain
∆f(x0) + [(2k − 1)/2]∆²f(x0) + [(3k² − 6k + 2)/6]∆³f(x0) = 0
or, [(1/2)∆³f(x0)]k² + [∆²f(x0) − ∆³f(x0)]k + [∆f(x0) − (1/2)∆²f(x0) + (1/3)∆³f(x0)] = 0
Substituting these values from the table, you have,
[(1/2) × 3]k² + (−0.5 − 3)k + [0.25 − (1/2) × (−0.5) + (1/3) × 3.0] = 0
or, (3/2)k² − 3.5k + 1.5 = 0
Solving for k, you get
k = [3.5 ± √((3.5)² − 4 × (3/2) × 1.5)] / (2 × (3/2))
= [3.5 ± √((3.5)² − 9)] / 3 = (3.5 ± 1.803)/3
so, k = 1.7676, 0.5656
Now, x = x0 + kh = 1.7676 (as x0 = 0 and h = 1)
and, x = x0 + kh = 0.5656
for k = 1.7676 and k = 0.5656, respectively.
The values of the function f(x) at k = 1.7676 and k = 0.5656, respectively, are
[f(x)]x = 1.7676 = 0.042187
and, [f(x)]x = 0.5656 = 0.16455

Now you will find d²f/dk². If d²f/dk² > 0 at k, then the value of the function at
that k is a minimum, and if d²f/dk² < 0 at k, then the value of the function at
that k is a maximum. Thus,
d²f/dk² = ∆²f(x0) + [(6k − 6)/6]∆³f(x0) + [(12k² − 36k + 22)/24]∆⁴f(x0) + ...
At k = 1.7676,
d²f/dk² = (−0.5) + [(6 × 1.7676 − 6)/6] × 3 + [(12 × (1.7676)² − 36 × 1.7676 + 22)/24] × 6
= 0.7676
This is positive. Thus, the value of the function at x = 1.7676 is a minimum.
Also, at k = 0.5656,
d²f/dk² = (−0.5) + (0.5656 − 1) × 3 + [(12 × (0.5656)² − 36 × 0.5656 + 22)/24] × 6
= −0.4339
As the value of d²f/dk² at k = 0.5656 is negative, the value of the function at
x = x0 + kh = 0.5656 is a maximum.
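The quadratic in k can of course be solved mechanically; a minimal sketch (our own illustration) using the leading differences from the table:

```python
# Locating the stationary points of Example 3.4: solve
# (1/2)*d3*k^2 + (d2 - d3)*k + (d1 - d2/2 + d3/3) = 0
# with the leading differences d1 = 0.25, d2 = -0.5, d3 = 3.0.

from math import sqrt

d1, d2, d3 = 0.25, -0.5, 3.0
a = d3 / 2
b = d2 - d3
c = d1 - d2 / 2 + d3 / 3

disc = sqrt(b * b - 4 * a * c)
k1 = (-b + disc) / (2 * a)
k2 = (-b - disc) / (2 * a)
print(round(k1, 4), round(k2, 4))   # about 1.7676 and 0.5657
```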

CHECK YOUR PROGRESS


10. Write Newton's divided difference formula.
11. When is the value of the function a minimum, and when a maximum, at a given k?

3.4 DIFFERENTIATING A GRAPHICAL FUNCTION

Differentiating a function numerically is done by using finite differences. A function
can also be differentiated graphically. Every function of x, f(x), has a graph, and
the slope at any point on this graph gives its derivative at that point. Thus, the
derivative f′(x) of the function f(x) with respect to x equals the tangent of the
angle that the tangent line makes with the positive x-axis. In the figure below, the
tangent at the point a makes an angle 180 − α with the positive x-axis, so the
gradient is given by tan(180 − α) = −tan α. Usually, the slope or gradient is
denoted by the letter m.
[Figure: graph of f(x) on 0 ≤ x ≤ 3, with the tangent drawn at the point a making an angle 180 − α with the positive x-axis]

Thus, there are two basic facts about the derivative of a function:
1. The value of f′(a), which is the value of f′(x) at x = a, gives the slope of
the graph of f(x) at x = a.
2. The derivative of a function is also a function of x, and the slope at any
point of its graph depends on the x-coordinate of that point.
Keeping in mind these two basic facts, we can draw the graph of the derivative
f′(x) from the graph of f(x).
Graph of the Derivative
Graph of a function f(x) is shown below.

[Figure: graph of f(x) on 0 ≤ x ≤ 3]

As we know, f′(a) gives the slope of the tangent at the point (a, f(a)) on the
graph of f(x). For sketching the graph of the derivative of f(x), we select a few
points on the graph. We take values of x at 0, 0.5, 1.0, 1.5, 2.0, 2.5 and 3.0
and make a table by finding the slopes of the tangents at these points. The table
is drawn as below:



x:          0    0.5    1     1.5    2    2.5    3
m = f′(x):  3    0      −4    −3     0    1      0
For measuring the exact slope, use a ruler and a grid; in their absence, a rough
sketch will do. With these tabulated values, a graph of the derivative can now be
drawn, as shown below.

[Figure: graph of f′(x) drawn through the tabulated slope values; the curve crosses the x-axis at x = 0.5, 2 and 3]

With the help of this graph of the derivative, one can find the derivative at any
point. Note also that the graph intersects the x-axis at points corresponding to
the low and high points on the graph of the function f(x). By taking more points,
a more accurate graph can be drawn.
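When the function behind a graph can be sampled, the slope table can be produced numerically instead of by ruler and grid. A minimal sketch (our own illustration, using f(x) = x² rather than the plotted curve, whose formula is not given in the text):

```python
# Estimating the slope table of a graph numerically: the slope at x is
# approximated by the central difference (f(x + h) - f(x - h)) / (2h).

def slope(f, x, h=1e-5):
    return (f(x + h) - f(x - h)) / (2 * h)

f = lambda x: x ** 2   # example function; its exact slope is 2x
table = {x: round(slope(f, x), 6) for x in [0, 0.5, 1, 1.5, 2, 2.5, 3]}
print(table)   # slopes 0, 1, 2, 3, 4, 5, 6 -- the graph of f'(x) = 2x
```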

3.5 NUMERICAL INTEGRATION INTERPOLATION FORMULAE

The process of evaluating a definite integral from a set of tabulated values (xi, yi)
of the integrand (or function) f(x) is known as quadrature when applied to a
function of a single variable.
Numerical integration of a function f(x) is done by representing f(x) as an
interpolation formula and then integrating it between the prescribed limits. By this
approach, a quadrature formula is obtained for approximate integration of the
function f(x).
3.5.1 Newton–Cotes Quadrature Formulae
Let I = ∫_{a}^{b} f(x) dx   ...(3.37)



where f(x) takes values y0, y1, ..., yn for x = x0, x1, ..., xn (refer Figure 3.1).
[Figure: the curve y = f(x) with ordinates y0, y1, y2, ..., yn erected at x0, x0 + h, x0 + 2h, ..., x0 + nh]
Figure 3.1 Graphical Representation of Equation 3.37

We divide the interval (a, b) into n equal parts of width h such that,
a = x0, x1 = x0 + h, x2 = x0 + 2h, ..., xn = x0 + nh = b
Then,
I = ∫_{x0}^{x0+nh} f(x) dx
Put x = x0 + kh ⇒ dx = h·dk
When x = x0 ⇒ k = 0, and when x = xn ⇒ k = n
so, I = ∫₀ⁿ f(x0 + kh)·h·dk = h ∫₀ⁿ yk dk
Replacing f(x0 + kh), i.e., yk, by Newton’s forward difference interpolation
formula, you obtain
I = h ∫₀ⁿ [y0 + k∆y0 + (k(k − 1)/2!)∆²y0 + (k(k − 1)(k − 2)/3!)∆³y0 + ...] dk
Integrating term by term, you obtain
I = nh[y0 + (n/2)∆y0 + (1/2)(n²/3 − n/2)∆²y0 + (1/6)(n³/4 − n² + n)∆³y0
+ (1/24)(n⁴/5 − 3n³/2 + 11n²/3 − 3n)∆⁴y0 + ...]
This is known as the Newton–Cotes quadrature formula. By substituting
n = 1, 2, 3, ... you will deduce the following important quadrature formulae.



3.5.2 Trapezoidal Rule (n = 1)
Putting n = 1 in the Newton–Cotes quadrature formula and taking the curve through
(x0, y0) and (x1, y1) as a linear polynomial or straight line, so that the
differences of higher orders become zero, you get,
∫_{x0}^{x0+h} f(x) dx = h[y0 + (1/2)∆y0] = (h/2)(y0 + y1)   {As ∆y0 = y1 − y0}
Similarly, from (x1, y1) to (x2, y2),
∫_{x0+h}^{x0+2h} f(x) dx = h[y1 + (1/2)∆y1] = (h/2)(y1 + y2)   {As ∆y1 = y2 − y1}
and so on,
∫_{x0+(n−1)h}^{x0+nh} f(x) dx = (h/2)(yn−1 + yn)
Adding these n integrals, you have,
∫_{x0}^{x0+nh} f(x) dx = (h/2)[y0 + 2(y1 + y2 + ... + yn−1) + yn]
This is known as the trapezoidal rule.
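The composite rule above is a direct transcription into code; a minimal sketch (our own illustration):

```python
# Composite trapezoidal rule over n equal subintervals.

def trapezoidal(f, a, b, n):
    h = (b - a) / n
    ys = [f(a + i * h) for i in range(n + 1)]
    return (h / 2) * (ys[0] + 2 * sum(ys[1:-1]) + ys[-1])

# Example: the integral of x^2 on [0, 1] is 1/3.
print(trapezoidal(lambda x: x * x, 0.0, 1.0, 100))   # about 0.33335
```

The error of the trapezoidal rule shrinks like h², so doubling n roughly quarters the error.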

3.5.3 Simpson’s (1/3) Rule (n = 2)
By fitting a parabola through the points (x0, y0), (x1, y1) and (x2, y2), i.e.,
putting n = 2 so that the differences of order higher than the second vanish, you obtain,
I = ∫_{x0}^{x0+2h} f(x) dx = 2h[y0 + ∆y0 + (1/6)∆²y0] = (h/3)[y0 + 4y1 + y2]
{As ∆y0 = y1 − y0 and ∆²y0 = y2 − 2y1 + y0}
Similarly, ∫_{x0+2h}^{x0+4h} f(x) dx = (h/3)[y2 + 4y3 + y4]
and so on, ∫_{x0+(n−2)h}^{x0+nh} f(x) dx = (h/3)[yn−2 + 4yn−1 + yn]
Adding all these integrals, you have (when n is even),
∫_{x0}^{x0+nh} f(x) dx = (h/3)[y0 + 4(y1 + y3 + y5 + ... + yn−1) + 2(y2 + y4 + ... + yn−2) + yn]
This is known as Simpson’s (1/3) rule and is most commonly used.
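In code, the alternating 4-2 weights look like this (our own sketch):

```python
# Composite Simpson's 1/3 rule; n must be even.

def simpson(f, a, b, n):
    if n % 2:
        raise ValueError("n must be even for Simpson's 1/3 rule")
    h = (b - a) / n
    ys = [f(a + i * h) for i in range(n + 1)]
    return (h / 3) * (ys[0] + 4 * sum(ys[1:-1:2]) + 2 * sum(ys[2:-1:2]) + ys[-1])

# Simpson's rule is exact for cubics: the integral of x^3 on [0, 1] is 0.25.
print(simpson(lambda x: x ** 3, 0.0, 1.0, 10))   # 0.25 up to rounding
```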
3.5.4 Boole’s Rule (n = 4)
Putting n = 4 in the Newton–Cotes quadrature formula and neglecting all
differences of order higher than four, you have,
∫_{x0}^{x0+4h} f(x) dx = h ∫₀⁴ [y0 + k∆y0 + (k(k − 1)/2!)∆²y0 + (k(k − 1)(k − 2)/3!)∆³y0
+ (k(k − 1)(k − 2)(k − 3)/4!)∆⁴y0] dk
(By Newton’s forward interpolation formula)
= 4h[y0 + 2∆y0 + (5/3)∆²y0 + (2/3)∆³y0 + (7/90)∆⁴y0]
= (2h/45)(7y0 + 32y1 + 12y2 + 32y3 + 7y4)
Similarly, ∫_{x0+4h}^{x0+8h} f(x) dx = (2h/45)(7y4 + 32y5 + 12y6 + 32y7 + 7y8), and so on.
Adding all these integrals from x0 to x0 + nh, where n is a multiple of 4, you get,
∫_{x0}^{x0+nh} f(x) dx = (2h/45)[7y0 + 32y1 + 12y2 + 32y3 + 14y4 + 32y5 + 12y6 + 32y7 + 14y8 + ... + 7yn]
This is known as Boole’s rule.
While applying Boole’s rule, the number of subintervals should be taken as
a multiple of 4.
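Summing one five-point panel per four subintervals gives the composite rule directly (our own sketch):

```python
# Composite Boole's rule; the number of subintervals n must be a multiple of 4.

def boole(f, a, b, n):
    if n % 4:
        raise ValueError("n must be a multiple of 4 for Boole's rule")
    h = (b - a) / n
    y = [f(a + i * h) for i in range(n + 1)]
    total = 0.0
    for i in range(0, n, 4):   # one panel per four subintervals
        total += 7 * y[i] + 32 * y[i + 1] + 12 * y[i + 2] + 32 * y[i + 3] + 7 * y[i + 4]
    return 2 * h * total / 45

# Boole's rule is exact for polynomials up to degree 5:
print(boole(lambda x: x ** 5, 0.0, 1.0, 8))   # 1/6 up to rounding
```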
3.5.5 Weddle’s Rule (n = 6)
Putting n = 6 in Newton–Cote’s quadrature formula and neglecting all differences
of order higher than six, you get,
    ∫ from x0 to x0+6h of f(x) dx

    = h ∫ from 0 to 6 of [y0 + k∆y0 + (k(k − 1)/2!)∆²y0 + (k(k − 1)(k − 2)/3!)∆³y0
        + (k(k − 1)(k − 2)(k − 3)/4!)∆⁴y0 + (k(k − 1)(k − 2)(k − 3)(k − 4)/5!)∆⁵y0
        + (k(k − 1)(k − 2)(k − 3)(k − 4)(k − 5)/6!)∆⁶y0] dk

    = h[ky0 + (k²/2)∆y0 + (1/2)(k³/3 − k²/2)∆²y0 + (1/6)(k⁴/4 − k³ + k²)∆³y0
        + (1/24)(k⁵/5 − 3k⁴/2 + 11k³/3 − 3k²)∆⁴y0
        + (1/120)(k⁶/6 − 2k⁵ + 35k⁴/4 − 50k³/3 + 12k²)∆⁵y0
        + (1/720)(k⁷/7 − 5k⁶/2 + 17k⁵ − 225k⁴/4 + 274k³/3 − 60k²)∆⁶y0], evaluated from 0 to 6

    = 6h[y0 + 3∆y0 + (9/2)∆²y0 + 4∆³y0 + (41/20)∆⁴y0 + (11/20)∆⁵y0 + (41/840)∆⁶y0]

    = (6h/20)[20y0 + 60∆y0 + 90∆²y0 + 80∆³y0 + 41∆⁴y0 + 11∆⁵y0 + (41/42)∆⁶y0]

    = (3h/10)[20y0 + 60(y1 − y0) + 90(y2 − 2y1 + y0) + 80(y3 − 3y2 + 3y1 − y0)
        + 41(y4 − 4y3 + 6y2 − 4y1 + y0) + 11(y5 − 5y4 + 10y3 − 10y2 + 5y1 − y0)
        + (y6 − 6y5 + 15y4 − 20y3 + 15y2 − 6y1 + y0)]      (taking 41/42 ≈ 1)

    = (3h/10)[y0 + 5y1 + y2 + 6y3 + y4 + 5y5 + y6]
Similarly,
    ∫ from x0+6h to x0+12h of f(x) dx = (3h/10)[y6 + 5y7 + y8 + 6y9 + y10 + 5y11 + y12]
    ...
    ∫ from x0+(n−6)h to x0+nh of f(x) dx = (3h/10)[y_{n−6} + 5y_{n−5} + y_{n−4} + 6y_{n−3}
                                            + y_{n−2} + 5y_{n−1} + yn]
Adding the above integrals, you get,

    ∫ from x0 to x0+nh of f(x) dx = (3h/10)[y0 + 5y1 + y2 + 6y3 + y4 + 5y5 + 2y6
                                      + 5y7 + y8 + 6y9 + y10 + 5y11 + 2y12 + ...]
which is known as Weddle’s rule. Here n must be a multiple of 6.
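As with Boole's rule, a panel loop gives the composite form directly; the coefficient 2 at y6, y12, ... arises because adjacent panels each contribute 1 there. A minimal sketch, not from the text:

```python
import math

def weddle(y, h):
    """Composite Weddle's rule; len(y) - 1 must be a multiple of 6."""
    n = len(y) - 1
    if n % 6 != 0:
        raise ValueError("the number of subintervals must be a multiple of 6")
    total = 0.0
    for i in range(0, n, 6):   # one 6-subinterval panel at a time
        total += (y[i] + 5*y[i+1] + y[i+2] + 6*y[i+3]
                  + y[i+4] + 5*y[i+5] + y[i+6])
    return 3 * h / 10 * total

# ∫ from 0 to π/2 of sin x dx = 1; even a single panel of six subintervals
# reproduces the exact value to better than 1e-5:
h = math.pi / 12
ordinates = [math.sin(i * h) for i in range(7)]
print(weddle(ordinates, h))
```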
Example 3.5: The velocities of a car running on a straight road at intervals of
2 minutes are given as follows:

    Time (in min):        0   2   4   6   8  10  12
    Velocity (in km/hr):  0  22  30  27  18   7   0

Use (i) Trapezoidal rule, (ii) Simpson's 1/3 rule and (iii) Weddle's rule to find the
distance covered by the car in 12 minutes.
Solution: Let k be the distance travelled by the car in t min.

Then, dk/dt = v, where v is the velocity of the car,

so, [k] from t = 0 to t = 12 = ∫ from 0 to 12 of v dt

Here n = 6. (Strictly, since the velocity is in km/hr while h = 2 is in minutes, the
numerical results below should be divided by 60 to give kilometres; the computation
follows the tabulated values directly.) So:

Self-Instructional Material 107


(i) By Trapezoidal rule:

    ∫ from 0 to 12 of v dt = (h/2)[v0 + v6 + 2(v1 + v2 + ... + v5)]
                           = (2/2)[0 + 0 + 2(22 + 30 + 27 + 18 + 7)]
                           = 208 km
(ii) By Simpson's 1/3 rule:

    ∫ from 0 to 12 of v dt = (h/3)[v0 + v6 + 2(v2 + v4) + 4(v1 + v3 + v5)]
                           = (2/3)[0 + 0 + 2(30 + 18) + 4(22 + 27 + 7)] = 213.33 km
(iii) By Weddle's rule:

    ∫ from 0 to 12 of v dt = (3h/10)[v0 + 5v1 + v2 + 6v3 + v4 + 5v5 + v6]
                           = (3 × 2/10)[0 + 5 × 22 + 30 + 6 × 27 + 18 + 5 × 7 + 0]
                           = 213.00 km
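The three answers of Example 3.5 can be checked with a few lines of direct arithmetic (a sketch written for this example, not from the text):

```python
# Ordinates of Example 3.5: velocity (km/hr) tabulated every h = 2 minutes
v = [0, 22, 30, 27, 18, 7, 0]
h = 2

trapezoidal = h/2 * (v[0] + v[6] + 2*sum(v[1:6]))
simpson = h/3 * (v[0] + v[6] + 2*(v[2] + v[4]) + 4*(v[1] + v[3] + v[5]))
weddle = 3*h/10 * (v[0] + 5*v[1] + v[2] + 6*v[3] + v[4] + 5*v[5] + v[6])

print(trapezoidal, simpson, weddle)
```

The three rules agree to within a few per cent, with Simpson's 1/3 and Weddle's rules closest to each other.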
Example 3.6: A curve is given by the following table:

    x:  0  1  2    3    4  5    6
    y:  0  2  2.5  2.3  2  1.7  1.5

The x-coordinate x̄ of the centroid of the area bounded by the curve, the end
ordinates and the x-axis is given by A·x̄ = ∫ from 0 to 6 of xy dx, where A is the
area. Find x̄ using Simpson's 1/3 rule.
Solution: Given A·x̄ = ∫ from 0 to 6 of xy dx, where A is the area,

    x̄ = (∫ from 0 to 6 of xy dx) / (∫ from 0 to 6 of y dx)

as A = ∫ from 0 to 6 of y dx is the area. Now, you will compute xy as shown in the
table given below.

Computing Values for x, y

    x:       0  1  2    3    4  5    6
    y:       0  2  2.5  2.3  2  1.7  1.5
    X = xy:  0  2  5.0  6.9  8  8.5  9.0

Then, using Simpson's 1/3 rule (with h = 1),

    ∫ from 0 to 6 of xy dx = (h/3)[X0 + X6 + 2(X2 + X4) + 4(X1 + X3 + X5)]
                           = (1/3)[0 + 9 + 2(5 + 8) + 4(2 + 6.9 + 8.5)]
                           = 34.8667

and, ∫ from 0 to 6 of y dx = (h/3)[y0 + y6 + 2(y2 + y4) + 4(y1 + y3 + y5)]
                           = (1/3)[0 + 1.5 + 2(2.5 + 2) + 4(2 + 2.3 + 1.7)]
                           = 11.500

Thus, x̄ = (∫ from 0 to 6 of xy dx) / (∫ from 0 to 6 of y dx) = 34.8667/11.500 ≈ 3.0319
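The centroid computation above is just two Simpson integrations and one division, which can be verified directly (a sketch for this example; the helper name simpson_13 is mine):

```python
y = [0, 2, 2.5, 2.3, 2, 1.7, 1.5]
xy = [x * yx for x, yx in zip(range(7), y)]   # the X = xy row of the table

def simpson_13(f, h):
    """Composite Simpson's 1/3 rule for tabulated ordinates (even subinterval count)."""
    return h/3 * (f[0] + f[-1] + 4*sum(f[1:-1:2]) + 2*sum(f[2:-1:2]))

num = simpson_13(xy, 1)   # ∫ x y dx ≈ 34.8667
den = simpson_13(y, 1)    # ∫ y dx   ≈ 11.5 (the area A)
print(num / den)          # the centroid abscissa x̄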
Example 3.7: A river is 60 ft wide. The depth ‘d’ in feet at a distance x ft from one
bank is given by the following table:

    x:  0  10  20  30  40  50  60
    d:  0   5   9  15  10   6   2

Find approximately the area of the cross-section by, (i) Trapezoidal rule,
(ii) Simpson's 1/3 rule and (iii) Weddle's rule.
Solution: The required area of the cross-section of the river is given as,

    ∫ from 0 to 60 of d dx

Here the number of subintervals is 6.
(i) By Trapezoidal rule:

    ∫ from 0 to 60 of d dx = (h/2)[d0 + d6 + 2(d1 + d2 + d3 + d4 + d5)]
                           = (10/2)[(0 + 2) + 2(5 + 9 + 15 + 10 + 6)] = 460

Hence, the required area of the cross-section of the river using the Trapezoidal
rule is 460 sq. ft.
(ii) By Simpson's 1/3 rule:

    ∫ from 0 to 60 of d dx = (h/3)[(d0 + d6) + 4(d1 + d3 + d5) + 2(d2 + d4)]
                           = (10/3)[(0 + 2) + 4(5 + 15 + 6) + 2(9 + 10)]
                           = 480

Hence, the required area of the cross-section of the river using Simpson's
1/3 rule is 480 sq. ft.
(iii) By Weddle's rule:

    ∫ from 0 to 60 of d dx = (3h/10)[d0 + 5d1 + d2 + 6d3 + d4 + 5d5 + d6]
                           = (3 × 10/10)[0 + 5 × 5 + 9 + 6 × 15 + 10 + 5 × 6 + 2]
                           = 3[25 + 9 + 90 + 10 + 30 + 2] = 498

Thus, the required area of the cross-section of the river using Weddle's rule
is 498 sq. ft.
3.5.6 Gauss Quadrature Formulae
We have already discussed the Trapezoidal rule and Simpson's rule of integration
of a function. Now, we will discuss the Gauss quadrature rule, which yields exact
results for polynomials up to a certain maximum degree.

A general quadrature rule in numerical analysis is an approximation of the definite
integral of a function. It is also stated as a weighted sum of function values at
specified points within the domain of integration.

The Gauss quadrature formula or Gaussian quadrature rule is named after Carl
Friedrich Gauss and is defined as an n-point quadrature rule that yields the exact
result for polynomials of degree 2n − 1, for n points xi and n weights wi. The
domain of integration for the rule is taken as [a, b] or [−1, 1]. Hence, the Gauss
quadrature rule is stated as,

    ∫ from −1 to 1 of f(x) dx ≈ Σ from i = 1 to n of wi f(xi)

The evaluation points are just the roots of a polynomial which belongs to a class
of orthogonal polynomials.

Legendre polynomials Pn(x) are taken as the associated polynomials for the
integration. The nth polynomial is normalized to give Pn(1) = 1; the ith Gauss node,
xi, is the ith root of Pn; and its weight wi is given by,

    wi = 2 / [(1 − xi²)(P′n(xi))²]

Table 3.1 gives some low-order rules to solve the integration:

                    Table 3.1 Rules to Solve Integration

    Number of points, n    Points, xi                     Weights, wi
    1                      0                              2
    2                      ±√(1/3)                        1
    3                      0                              8/9
                           ±√(3/5)                        5/9
    4                      ±√((3 − 2√(6/5))/7)            (18 + √30)/36
                           ±√((3 + 2√(6/5))/7)            (18 − √30)/36
    5                      0                              128/225
                           ±(1/3)√(5 − 2√(10/7))          (322 + 13√70)/900
                           ±(1/3)√(5 + 2√(10/7))          (322 − 13√70)/900

Change of Interval for Gaussian Quadrature

An integral over [a, b] is first changed into an integral over [−1, 1] before we apply
the Gaussian quadrature rule. Hence,

    ∫ from a to b of f(t) dt = ((b − a)/2) ∫ from −1 to 1 of f(((b − a)/2)x + (a + b)/2) dx

Using the Gaussian quadrature rule, we get the approximation,

    ((b − a)/2) Σ from i = 1 to n of wi f(((b − a)/2)xi + (a + b)/2)
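The change of interval and the tabulated nodes and weights combine into a very short routine. A sketch using the 3-point Gauss–Legendre rule (the name gauss3 is mine, not from the text):

```python
import math

# Nodes and weights of the 3-point Gauss-Legendre rule on [-1, 1]
NODES = [0.0, math.sqrt(3/5), -math.sqrt(3/5)]
WEIGHTS = [8/9, 5/9, 5/9]

def gauss3(f, a, b):
    """3-point Gauss quadrature on [a, b] via the change of interval above."""
    half, mid = (b - a) / 2, (a + b) / 2
    return half * sum(w * f(half * x + mid) for x, w in zip(NODES, WEIGHTS))

# Exact for polynomials up to degree 2n - 1 = 5:
print(gauss3(lambda x: x**5, 0, 1))   # exact value is 1/6
print(gauss3(math.exp, 0, 1))         # close to e - 1
```

With only three function evaluations this is far cheaper than a composite Newton–Cotes rule of comparable accuracy, which is the main appeal of Gauss quadrature.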

Self-Instructional Material 111


Derivation of Two-Point Gauss Quadrature Rule

The two-point Gauss quadrature rule is basically an extension of the Trapezoidal
rule. When the evaluation points of the function are not predefined as a and b, but
are the unknown variables x1 and x2, then according to the Gauss quadrature
two-point rule, the integral is approximated as,

    I = ∫ from a to b of f(x) dx ≈ c1 f(x1) + c2 f(x2)                    (3.38)


Now, we have four unknown variables x1, x2, c1 and c2. These variables are
chosen so that the formula gives the exact result for integrating a general third
order polynomial, f(x) = a0 + a1x + a2x² + a3x³. Hence,

    ∫ from a to b of f(x) dx = ∫ from a to b of (a0 + a1x + a2x² + a3x³) dx

    = [a0x + a1x²/2 + a2x³/3 + a3x⁴/4] from a to b

    = a0(b − a) + a1((b² − a²)/2) + a2((b³ − a³)/3) + a3((b⁴ − a⁴)/4)     (3.39)
    ∴ ∫ from a to b of f(x) dx ≈ c1 f(x1) + c2 f(x2)

    = c1(a0 + a1x1 + a2x1² + a3x1³) + c2(a0 + a1x2 + a2x2² + a3x2³)        (3.40)
In Equation (3.39) the constants a0, a1, a2, and a3 are arbitrary, so the
coefficients of these constants in Equations (3.39) and (3.40) must be equal. Hence,

    b − a = c1 + c2
    (b² − a²)/2 = c1x1 + c2x2
    (b³ − a³)/3 = c1x1² + c2x2²
    (b⁴ − a⁴)/4 = c1x1³ + c2x2³

The above mentioned simultaneous nonlinear equations have one real
solution.

112 Self-Instructional Material


    ∴ c1 = (b − a)/2

    c2 = (b − a)/2

    x1 = ((b − a)/2)(−1/√3) + (b + a)/2

    x2 = ((b − a)/2)(1/√3) + (b + a)/2
Substituting these values in Equation (3.38), we get,

    ∫ from a to b of f(x) dx ≈ c1 f(x1) + c2 f(x2)

    = ((b − a)/2) f(((b − a)/2)(−1/√3) + (b + a)/2)
      + ((b − a)/2) f(((b − a)/2)(1/√3) + (b + a)/2)
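The derivation can be checked numerically: with the values of c1, c2, x1, x2 found above, the two-point rule must reproduce the integral of any cubic exactly. A minimal sketch (the name gauss2 is mine):

```python
import math

def gauss2(f, a, b):
    """Two-point Gauss rule: c1 = c2 = (b-a)/2, nodes at midpoint -/+ (b-a)/(2*sqrt(3))."""
    c = (b - a) / 2
    mid = (a + b) / 2
    r = (b - a) / (2 * math.sqrt(3))
    return c * (f(mid - r) + f(mid + r))

# An arbitrary cubic a0 + a1 x + a2 x^2 + a3 x^3 on [1, 3]:
f = lambda x: 1 + 2*x - x**2 + 4*x**3
exact = 84 - 8/3   # antiderivative x + x^2 - x^3/3 + x^4 evaluated at 3 minus at 1
print(gauss2(f, 1, 3), exact)
```

Two function evaluations thus achieve the degree-3 exactness that the Trapezoidal rule (also two evaluations) only achieves for straight lines.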
3.5.7 Errors in Quadrature Formulae
The error is given as,

    Error = Actual value − Approximate value

In case of a quadrature formula the error is given as,

    E = Actual Integration − Approximate Integration
      = ∫ from a to b of y dx − ∫ from a to b of P(x) dx

where P(x) is the polynomial representing the function y = f(x) in the given
interval [a, b].
Table 3.2 gives the error in each of the rules of numerical integration.

            Table 3.2 Error in each Rule of Numerical Integration

    Rule                 Error (E)                      Order of E
    Trapezoidal Rule     E = −((b − a)h²/12) y''        h²
    Simpson's 1/3 Rule   E = −((b − a)h⁴/180) y^(iv)    h⁴
    Boole's Rule         E = −(8h⁷/945) y^(vi)          h⁷
    Weddle's Rule        E = −(h⁷/140) y^(vi)           h⁷
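The 'Order of E' column can be observed numerically: when h is halved, the error of an O(h²) rule shrinks by about 2² = 4 and that of an O(h⁴) rule by about 2⁴ = 16. A minimal check, assuming nothing beyond the rules themselves:

```python
import math

def trapezoid(f, a, b, n):
    h = (b - a) / n
    return h * (f(a)/2 + sum(f(a + i*h) for i in range(1, n)) + f(b)/2)

def simpson(f, a, b, n):                       # n must be even
    h = (b - a) / n
    s = f(a) + f(b)
    s += 4 * sum(f(a + i*h) for i in range(1, n, 2))
    s += 2 * sum(f(a + i*h) for i in range(2, n, 2))
    return h / 3 * s

exact = 1 - math.cos(1)                        # ∫ from 0 to 1 of sin x dx
rt = abs(trapezoid(math.sin, 0, 1, 8) - exact) / abs(trapezoid(math.sin, 0, 1, 16) - exact)
rs = abs(simpson(math.sin, 0, 1, 8) - exact) / abs(simpson(math.sin, 0, 1, 16) - exact)
print(rt, rs)   # error ratios, near 4 for the Trapezoidal rule and near 16 for Simpson's
```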
Self-Instructional Material 113
CHECK YOUR PROGRESS
12. Define quadrature process.
13. What is numerical integration?
14. Write Newton Cote's quadrature formula.
15. Write the trapezoidal rule.
16. Write Boole's rule.
17. Write Weddle's rule.

3.6 APPROXIMATION THEORY

Approximation theory deals with two types of problems:

• When a function is given explicitly and you want to find a simpler type, such
  as a polynomial, to represent it.
• When the problem concerns fitting a function to given data and finding the best
  function in a certain class that can be used to represent the data.
The major aim of approximation of functions is to represent a function with
a minimum error, as this is a central problem in software development. Since
computers are essentially arithmetic devices, the most elaborate function
they can compute is a rational function, a ratio of polynomials. There are two ways
to approximate a function by a polynomial:
1. Using truncated Taylor’s Series
2. Using Chebyshev polynomials
Taylor's Series Representation: Let f(x) be a function having up to (n + 1)th
derivatives in an interval [a, b]; then it may be expressed near x = x0 in [a, b] as

    f(x) = f(x0) + (x − x0) f′(x0) + ((x − x0)²/2!) f″(x0) + ...
           + ((x − x0)ⁿ/n!) f⁽ⁿ⁾(x0) + ((x − x0)ⁿ⁺¹/(n + 1)!) f⁽ⁿ⁺¹⁾(k)

where f′(x0), f″(x0), ..., f⁽ⁿ⁾(x0) are the derivatives of f(x) evaluated at x0.

The term ((x − x0)ⁿ⁺¹/(n + 1)!) f⁽ⁿ⁺¹⁾(k) is called the ‘remainder term’. Here, k is
a function of x and lies between x0 and x.

114 Self-Instructional Material


This remainder term gives the Truncation Error if only the first n terms in the
Taylor series are used to represent the function.

Hence,

    Truncation Error = |f⁽ⁿ⁺¹⁾(k) (x − x0)ⁿ⁺¹/(n + 1)!| ≤ M |x − x0|ⁿ⁺¹/(n + 1)!

where M = max |f⁽ⁿ⁺¹⁾(k)| for x in [a, b].

Example 3.8: Give a Taylor series representation of f(x) = sin x and compute sin x
correct to three significant digits.

Solution: The Taylor series representation of sin x (about x0 = 0) is

    f(x) = f(x0) + f′(x0)(x − x0) + f″(x0)((x − x0)²/2!) + ...

         = sin(0) + cos(0)·x − sin(0)·(x²/2!) − cos(0)·(x³/3!) + ...

Thus, sin x = x − x³/3! + x⁵/5! − x⁷/7! + ...

or,   sin x = x − x³/6 + x⁵/120 − x⁷/5040 + ...                          ...(1)

Since you require sin x correct to three significant digits and the truncation
error after three terms of the equation is

    T.E. ≤ 1/5040 = 0.000198,

you truncate the series after three terms.

So, sin x = x − x³/3! + x⁵/5!

is the required representation to obtain the value of sin x correct to three significant
digits.
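The truncation-error bound of Example 3.8 can be confirmed by comparing the three-term polynomial against the library sine on [0, 1] (a small check written for this example, not part of the text):

```python
import math

def sin3(x):
    """Three-term Taylor polynomial: sin x ≈ x - x^3/3! + x^5/5!."""
    return x - x**3/6 + x**5/120

# On [0, 1] the first omitted term bounds the truncation error by 1/5040 ≈ 0.000198:
worst = max(abs(sin3(i/100) - math.sin(i/100)) for i in range(101))
print(worst)
```

The observed worst-case error sits just below the 1/5040 bound, exactly as the alternating-series argument predicts.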
Example 3.9: Express the Taylor series expansion of

    e⁻ˣ = 1 − x + x²/2! − x³/3! + x⁴/4! − ...

in terms of Chebyshev polynomials.

Solution: The Chebyshev polynomial representation of

    e⁻ˣ = 1 − x + x²/2! − x³/3! + ...

is obtained by writing each power of x in terms of the Tᵢ(x):

    e⁻ˣ = T0(x) − T1(x) + (1/4)[T2(x) + T0(x)] − (1/24)[3T1(x) + T3(x)]
          + (1/192)[3T0(x) + 4T2(x) + T4(x)] − (1/1920)[10T1(x) + 5T3(x) + T5(x)] + ...

Thus, e⁻ˣ = 1.26606 T0(x) − 1.13021 T1(x) + 0.27148 T2(x)
            − 0.04427 T3(x) + 0.0054687 T4(x) − 0.0005208 T5(x) + ...


If you expand T0(x), T1(x), T2(x), T3(x), T4(x) and T5(x) using their
polynomial equivalents and truncate after six terms, you have,

    e⁻ˣ = 1.00045 − 1.000022x + 0.4991992x² − 0.166488x³
          + 0.043794x⁴ − 0.008687x⁵

Comparing this representation with the given Taylor series representation,
you can observe that there is a slight difference in the coefficients of different powers
of x. The main advantage of this representation as a sum of Chebyshev polynomials
is that, for a given error bound, you can truncate the series with fewer terms
compared to the Taylor series. Also, the error is more uniformly distributed for various
arguments. The possibility of a series with a lower number of terms is called
economization of power series. The maximum error in the six terms of the Chebyshev
representation of e⁻ˣ is 0.00045 whereas the error in the six terms of the Taylor
series representation of e⁻ˣ is 0.0014. Thus, you have to add one more term in the
Taylor series to ensure that the error is less than that in the Chebyshev approximation.
Example 3.10: Represent sin x = x − x³/3! + x⁵/5! − ... using Chebyshev polynomials
to obtain 3 significant digit accuracy in the computation of sin x.

Solution: Using Chebyshev polynomials, sin x is approximated as,

    sin x = T1(x) − (1/24)[3T1(x) + T3(x)] + (1/1920)[10T1(x) + 5T3(x) + T5(x)]
          ≈ 0.8802 T1(x) − 0.03906 T3(x) + 0.00052 T5(x)

As the coefficient of T5(x) is 0.00052 and |T5(x)| ≤ 1 for all x, omitting the last
term above still gives 3 significant digit accuracy.
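The claim of Example 3.10 can be verified by evaluating the two retained Chebyshev terms over [−1, 1] via the standard three-term recurrence (a check written for this example; the helper name T is mine):

```python
import math

def T(n, x):
    """Chebyshev polynomial T_n(x) via the recurrence T_{n+1}(x) = 2x*T_n(x) - T_{n-1}(x)."""
    a, b = 1.0, x
    for _ in range(n):
        a, b = b, 2*x*b - a
    return a

# The two retained terms of Example 3.10 (the 0.00052*T5 term is dropped):
approx = lambda x: 0.8802 * T(1, x) - 0.03906 * T(3, x)
worst = max(abs(approx(i/100) - math.sin(i/100)) for i in range(-100, 101))
print(worst)
```

The worst-case error over [−1, 1] stays below 10⁻³, i.e., roughly the size of the dropped T5 coefficient, which is what 3 significant digit accuracy requires here.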



CHECK YOUR PROGRESS

18. Which problem types are solved using approximation theory?
19. What is the main aim of approximation function? How is a function
approximated using a polynomial?

3.7 SUMMARY

In this unit, you have learned that:


• Analytical methods are used to express various mathematical functions.
• Sir Isaac Newton proposed Interpolation formulae for forward and
backward interpolation.
• Forward interpolation is expressed as a finite difference identity.
• Forward difference is shown by using ∆ which is known as forward
difference operator.
• Backward difference are denoted by backward difference operator ∇ and
is given as:
yn – 1 = yn – ∇yn ⇒ ∇yn = yn – yn – 1 and yn – 1 = (1 – ∇)yn
• Central difference operator is defined as δf(x) = f(x + h/2) – f(x – h/2).
• Quadrature is the process of evaluating a definite integral from a set of
tabulated values.
• Newton–Cote’s quadrature formula is divided into following important
quadrature formulae:
o Trapezoidal rule (n = 1)
o Simpson's 1/3 rule (n = 2)
o Simpson's 3/8 rule (n = 3)
o Boole’s Rule (n = 4)
o Weddle’s rule (n = 6)
• Approximation theory deals with two types of problems:
o When you want to find a simpler type such as polynomial for
representation.
o The problem concerns fitting function to given data and finding the best
function in certain class that is used to represent the data.
• There are two ways for approximating a function by a polynomial:
o Using truncated Taylor’s Series
o Using Chebyshev polynomials



3.8 KEY TERMS

• Numerical differentiation: The process of computing or approximating


the derivatives (n ≥ 1) of f(x) at some values of x from the given data
dx n
is known as numerical differentiation.
• Newton’s backward difference interpolation formula: It is
k ( k + 1) 2 k ( k + 1) ( k + 2) 3
y = yn + k ∇yn + ∇ yn + ∇ yn + ...
2! 3!
x − xn
where, k =
h

3.9 ANSWERS TO ‘CHECK YOUR PROGRESS’

1. Sir Isaac Newton had proposed Interpolation formulae for forward and
backward interpolation. These are used for numerical differentiation.
2. It is expressed as a finite difference identity from which an interpolated
value, in between tabulated points, is computed using the first value y0 with powers
of forward differences. Forward difference is shown by using ∆, which is known
as the forward difference operator. Forward difference is defined as the value
obtained by subtracting the present value from the next value.
3. Backward difference is denoted by backward difference operator ∇ and is
given as:
yn – 1 = yn – ∇yn ⇒ ∇yn = yn – yn – 1 and yn – 1 = (1 – ∇)yn
4. In forward difference it was y0 whereas in backward difference operator it
is yn.
5. Central difference operator is defined as δf(x) = f(x + h/2) – f(x – h/2).
This is known as first central difference operator.
6. y = y0 + k∆y0 + (k(k − 1)/2!)∆²y0 + (k(k − 1)(k − 2)/3!)∆³y0 + ...
   where k = (x − a)/h
7. y = yn + k∇yn + (k(k + 1)/2!)∇²yn + (k(k + 1)(k + 2)/3!)∇³yn + ...
   where k = (x − xn)/h
8. Formulae for computing higher derivatives may be obtained by successive
differentiation.



9. y = y0 + (k/1!)((∆y0 + ∆y−1)/2) + (k²/2!)∆²y−1 + (k(k² − 1²)/3!)((∆³y−1 + ∆³y−2)/2)
       + (k²(k² − 1²)/4!)∆⁴y−2 + (k(k² − 1²)(k² − 2²)/5!)((∆⁵y−2 + ∆⁵y−3)/2) + ...

   where k = (x − a)/h
10. Newton's divided difference formula is,
    f(x) = y0 + (x − x0)∆y0 + (x − x0)(x − x1)∆²y0 + (x − x0)(x − x1)(x − x2)∆³y0
           + (x − x0)(x − x1)(x − x2)(x − x3)∆⁴y0 + ...
11. If d²f/dk² > 0 at k, then the value of the function at that k is a minimum, and
    if d²f/dk² < 0 at k, then the value of the function is a maximum at that k.
12. The process of evaluating a definite integral from a set of tabulated values
(xi, yi) of the integrand (or function) f(x) is known as quadrature, when
applied to a function of single variable.
13. Numerical integration of a function f(x) is done by representing f(x) an
interpolation formula and then integrating it between the prescribed limits.
By this approach, a quadrature formula is obtained for approximate
integration of the function f(x).
14. Newton–Cote's quadrature formula is as follows:

    I = nh[y0 + (n/2)∆y0 + (1/2)(n²/3 − n/2)∆²y0 + (1/6)(n³/4 − n² + n)∆³y0
           + (1/24)(n⁴/5 − 3n³/2 + 11n²/3 − 3n)∆⁴y0 + ...]
15. The trapezoidal rule is as follows:

    ∫ from x0 to x0+nh of f(x) dx = (h/2)[y0 + 2(y1 + y2 + ... + y_{n−1}) + yn]
16. Boole's rule:

    ∫ from x0 to x0+nh of f(x) dx = (2h/45)[7y0 + 32y1 + 12y2 + 32y3 + 14y4
                                      + 32y5 + 12y6 + 32y7 + 7y8 + ...]
17. Weddle's rule:

    ∫ from x0 to x0+nh of f(x) dx = (3h/10)[y0 + 5y1 + y2 + 6y3 + y4 + 5y5 + 2y6
                                      + 5y7 + y8 + 6y9 + y10 + 5y11 + 2y12 + ...]


18. Approximation theory deals with two types of problems:
    • When a function is given explicitly and you want to find a simpler type
      such as a polynomial for representation.
    • When the problem concerns fitting a function to given data and finding
      the best function in a certain class that is used to represent the data.
19. The major aim of approximation of functions is to represent a function with
a minimum error as this is a central problem in the software development.
There are two ways to approximate a function by a polynomial:
• Using truncated Taylor’s Series
• Using Chebyshev polynomials

3.10 QUESTIONS AND EXERCISES

Short-Answer Questions
1. What is the Trapezoidal rule?
2. What is Simpson's 1/3 rule?
3. What is Weddle’s rule?
4. Evaluate ∫ from 1 to 2 of (1/x) dx by Simpson's 1/3 rule with n = 4 and determine
the error by direct integration.

Long-Answer Questions
1. The population of a certain town is shown in the table below:

Year: 1951 1961 1971 1981 1991


Population (in thousand) : 19.96 39.65 58.81 77.21 94.61

Find the rate of growth of population in 1981.


(Hint: The rate of growth of population with respect to year is dp/dy, where
p is for population and y stands for year.)
2. From the values in the table given below, find the value of sec 31° using
numerical differentiation.
θ° : 31 32 33 34
tan θ° : 0.6008 0.6249 0.6494 0.6745

(Hint: d(tan θ)/dθ = sec² θ.)




3. Find (i) y′(0), y′′(0) (ii) y′(2), y′′(2) and (iii) y′(3.5), y′′(3.5) from the following
table:

x: 0 0.5 1.5 2 2.5 3 3.5


y: 4 6 8 15 20 24 30

4. Find f ′(10) and f ′′(10) from the following table:


x: 3 5 11 27 34
f ( x) : − 13 23 899 17315 35606

5. Using Bessel’s formula, find f ′(7.5) from the following table:

x: 7.47 7.48 7.49 7.50 7.51 7.52 7.53


y: 0.193 0.195 0.198 0.201 0.203 0.206 0.208

6. From the table given below, find for what value of x, y is maximum. Also
find this value of y.

x: 3 4 5 6 7 8 9
y: 0.205 0.240 0.259 0.262 0.250 0.224 0.201

7. Using the following data find the value of x for which y is minimum. Find this
value of y also.

x: 0.60 0.65 0.70 0.75


y: 0.6221 0.6155 0.6138 0.6170

8. From the following given data find the maximum value of y.

x: −1 1 2 3
y: − 21 15 12 3

Also find the value of x for which y is maximum.


9. Evaluate the integral ∫ from 0 to π/2 of cos θ dθ by dividing the interval into
6 parts, using (i) Trapezoidal rule (ii) Simpson's 1/3 rule (iii) Weddle's rule.
10. Evaluate ∫ from π/6 to π/2 of log10 sin x dx by Simpson's 1/3 rule by dividing
the interval into 6 parts.
11. Evaluate ∫ from 4 to 5.2 of logₑ x dx using,
    (i) Trapezoidal rule (ii) Weddle's rule



12. The velocities of a car running on a straight road at intervals of 2 minutes are
given as follows:

    Time (in minutes):    0   2   4   6   8  10  12  14  16
    Velocity (in km/hr):  0  25  35  29  15   9   5   3   0

Apply Simpson's rule to find the distance covered by the car.
13. Evaluate ∫ from 0 to 1 of cos x dx by Trapezoidal rule. Take h = 0.2.
14. Evaluate ∫ from 0 to 6 of eˣ dx by (i) Simpson's rule (ii) Weddle's rule and
compare with the actual value.
15. Find the value of ∫ from 0 to π/2 of √(1 − 0.162 sin²x) dx using Simpson's 1/3
rule taking 6 subintervals.
16. A curve is drawn using the following given table to pass through the points:

x: 1 1.5 2 2.5 3 3.5 4


y: 2 2.5 2.9 3.5 4.0 2.8 2.2

Estimate the area bounded by the curve, x-axis and the lines x = 1 and
x = 4.
[Hint: Area A = ∫ from 1 to 4 of y dx]
17. Estimate the length of the arc of the curve y = x³/3 from (0, 0) to (1, 1/3)
using Simpson's 1/3 rule taking 8 subintervals.

[Hint: Length of the arc S = ∫ from 0 to 1 of √(1 + (dy/dx)²) dx]
18. Express the following as polynomials in x
    (i) T0(x) + 2T1(x) + (1/4)T2(x)   (ii) 2T0(x) − T2(x) + (1/8)T4(x)
(iii) 5T0 ( x) + 2T1 ( x) + 4T2 ( x) − 8T3 ( x)
19. Compute cos x correct to 3 significant digits. Obtain a series with a minimum
number of terms using the Taylor series and the Chebyshev series.
20. Obtain a Taylor series expansion of ex and represent it in terms of Chebyshev
polynomials.



3.11 FURTHER READING

Mott, J.L. Discrete Mathematics for Computer Scientists, 2nd Edition. New
Delhi: Prentice-Hall of India Pvt. Ltd., 2007.
Chance, William A. Statistical Methods for Decision Making. Illinois: Richard
D Irwin, 1969.
Bonini, Charles P., Warren H. Hausman, and Harold Bierman. Quantitative
Analysis for Business Decisions. Illinois: Richard D. Irwin, 1986.
Charnes, A., W.W. Cooper, and A. Henderson. An Introduction to Linear
Programming. New York: John Wiley & Sons, 1953.
Hogg, Robert. V., Allen T. Craig and Joseph W. McKean. Introduction to
Mathematical Statistics. New Delhi: Pearson Education, 2005.



UNIT 4 SOLUTION OF ALGEBRAIC AND TRANSCENDENTAL EQUATIONS
Structure
4.0 Introduction
4.1 Unit Objectives
4.2 Types of Non-linear Equations
4.3 Intermediate Value Theorem
4.4 Methods of Finding Solutions of Algebraic and Transcendental Equations
4.4.1 Direct Methods
4.4.2 Iterative Method
4.4.3 Initial Guess
4.4.4 Rate of Convergence
4.5 Bisection Method
4.5.1 Minimum Number of Iterations Required in Bisection Method to Achieve
the Desired Accuracy
4.5.2 Convergence of Bisection Method
4.6 Regula–Falsi Method or Method of False Position
4.6.1 Convergence of Regula–Falsi Method
4.7 Secant Method
4.8 Fixed Point Iteration Method
4.8.1 Condition for Convergence of Iteration Method
4.8.2 Convergence of Iteration Method
4.9 Newton–Raphson Method
4.9.1 Geometrical Interpretation
4.9.2 Convergence of Newton–Raphson Method
4.9.3 Newton–Raphson Method for System of Non-Linear Equations
4.9.4 Generalized Newton’s Method for Multiple Roots
4.10 Summary
4.11 Key Terms
4.12 Answers to ‘Check Your Progress’
4.13 Questions and Exercises
4.14 Further Reading

4.0 INTRODUCTION

In this unit, you will learn that the solution of algebraic and transcendental equations
is a topic of interest not only to mathematicians but also to scientists and engineers.
There are analytical methods to solve algebraic equations of degree two or three
or four. However, the need often arises to solve higher degree algebraic equation
or transcendental equations for which direct or analytical methods are not available.
Thus, numerical techniques or approximate methods are required to solve such
kind of equations, and before you proceed to solve such kind of equations you
should find out the difference between algebraic and transcendental equations.

Let f(x) = a0 + a1x + a2x² + ... + anxⁿ be a polynomial of degree n, where
the a's are constants such that an ≠ 0 and n is a positive integer.
The polynomial equation f(x) = 0 is called an algebraic equation of degree n. If f(x)
contains some other functions such as log x, eˣ, trigonometric functions or aˣ, etc.,
then f(x) = 0 is called a transcendental equation.
You will learn that the solution of these equations is nothing but to obtain a
value of x which satisfies f(x) = 0. This value is called the root of the equation f(x)
= 0. In other words, x = α is called a root of equation f(x) = 0 if f(α) = 0. The
process of finding the roots of an equation is known as the solution of that equation.
This is a problem of great importance in applied mathematics. To find out the roots
numerically you should make use of the fundamental theorem on roots of f(x) = 0
in a ≤ x ≤ b.

4.1 UNIT OBJECTIVES

After going through this unit, you will be able to:


• Know about the intermediate value theorem
• Describe the various methods of solving algebraic and transcendental equations
• Understand the Bisection method
• Discuss the Regula–Falsi and Secant methods
• Explain the Iteration and Newton–Raphson methods

4.2 TYPES OF NON-LINEAR EQUATIONS

An equation that represents a straight line is known as linear equation whereas


those not representing a straight line are non-linear equations. A polynomial of first
degree is a linear equation whereas that having more than one degree is a non-
linear equation. Apart from this, equations involving logarithmic functions, reciprocal
functions, exponential functions and trigonometric functions, are all non-linear
equations.
Thus non-linear equations are of following types:
• Nonlinear algebraic equations involving polynomial of degree two and higher
o Quadratic Equations
o Cubic equations
o Quartic Equations
o Quintic Equations
o Sextic Equations, and like that.

• Trigonometric Equations
• Logarithmic equations
• Exponential equations
Quadratic Equations NOTES

As we know, a polynomial equation of second degree is known as a quadratic
equation and has the general form:
ax² + bx + c = 0,
where a, b, and c are constants, known as quadratic co-efficient and x is a variable,
and a ≠ 0. Quadratic equations are used for computation of trajectories in projectile
motion.
A quadratic equation having real or complex coefficients provides two
solutions and each of these solutions is known as root and hence, a quadratic
equation has two roots. These roots may not be distinct. Roots may not be real
also.
Roots of a quadratic equation is given as:
− b ± b 2 − 4ac
x = ,
2a
And thus, roots are:
− b + b 2 − 4ac − b − b 2 − 4ac
x = and
2a 2a
The expression b2 − 4ac is known as discriminant, usually denoted as D.
Depending on the sign of D roots can be real or complex. If D is negative then
roots are complex.
In general nonlinear algebraic problems are solvable, and in case they are
not so, these may be well understood by way of numeric and qualitative analysis.
For example, the equation given below as
x2 + x – 1 = 0
may also be written in a different way as below:
f(x) = C where f(x) = x2 + x and C = 1
This is nonlinear since f(x) can not satisfy additivity and homogeneity. Here
non-linearity is due to the term x2.
Cubic Equations
A cubic equation involves a cubic function which is given below:
f (x) = ax3 + bx2 + cx + d and a ≠ 0. This is a polynomial of degree three.
Derivative of a cubic function is polynomial of having degree two and hence it is a
quadratic function and its integral gives a quartic function.
If a function is equated to zero, it becomes an equation, and hence
ax³ + bx² + cx + d = 0 is a cubic equation in which the coefficients are a, b, c, d
and these are, in general, real numbers. However, most of the theory is also valid
if they belong to a field of characteristic other than 2 or 3.
Solution of a cubic equation means finding its roots or finding zeros of the
cubic function.
Quartic Equations
A polynomial of fourth degree is known as a quartic function and when equated to
zero, it becomes an equation of fourth degree and is called quartic equation which
is given as:
    ax⁴ + bx³ + cx² + dx + e = 0

where a, b, c, d and e are constants such that a ≠ 0.

Quintic Equations
A quintic equation is formed as a polynomial function of degree five, equated to
zero and has the form:
ax5 + bx4 + cx3 + dx2 + ex + f = 0,
Where a, b, c, d, e, f are constants that may real or complex numbers where
a ≠ 0. These contain polynomial of odd degree and hence, their nature has some
similarity with cubic function and their graph may have additional local maxima
and local minima. There are some quintic equations such as x5 – x – 1 = 0 which
is not exactly solvable and these can be analyzed qualitatively analyzed.
Sextic Equations
A sextic equation contains a polynomial of degree six and has the form:

    ax⁶ + bx⁵ + cx⁴ + dx³ + ex² + fx + g = 0, where a ≠ 0.


Constants a, b, c, d, e, f and g are also known as coefficients, and they may be
real or complex numbers. Sextic functions have even degree and hence have
similarity with quadratic functions, having additional local maxima and local minima.
A sextic equation is also known as a hexic equation.
Thus, equations involving polynomial of degree two or more are all non-
linear equations.
Trigonometric Equations
There are six trigonometric functions namely, sin, cos, tan, cot, sec and cosec.
Any equation involving such functions is a non linear equation. In most of the
analysis, sin x or cos x are expanded as series in terms of powers of x as given
below:

128 Self-Instructional Material


Solution of Algebraic and
x3 x5 x 7 Transcendental Equations
sin (x) = x − + –
3 ! 5! 7 !
x2 x4
and cos x = 1 − + – ...for all x NOTES
2 ! 4!

Logarithmic Equations
A logarithmic equation involves one or more terms having a logarithmic function.
A logarithmic function of x can be expanded in terms of different powers of x and
hence these are non-linear. The expansion of the logarithmic function is shown as

    log (1 + x) = x − x²/2 + x³/3 − x⁴/4 + ... + (−1)ⁿ⁻¹ xⁿ/n + Rn

Putting x − 1 in place of x, we get the expansion for log x. Thus,

    log x = (x − 1) − (x − 1)²/2 + (x − 1)³/3 − (x − 1)⁴/4 + ...

Exponential Equations

These are equations involving exponential functions. An exponential function can
be expanded as a series containing terms with different powers of x, forming a
polynomial as shown below:

eˣ = Σ (n = 0 to ∞) xⁿ/n! = 1 + x + x²/2! + x³/3! + … for all x

Such equations are non-linear.


In general, an equation involving the nth power of x will have n roots, of
which all may not be real. If n = 1, it is a linear function giving only one
value of x, and if n > 1, it is not linear. If n = 2, it is quadratic; if n = 3,
it is cubic; and so on. In a quadratic equation with real coefficients, either
both roots are real or both are complex conjugates. In a cubic equation, at
least one root is real; the other two are either both real or a complex-conjugate
pair. In a quartic equation, there are four roots in two pairs, in which either
both pairs are real, or both are complex, or one pair is real and the other
complex. These are interesting observations.

4.3 INTERMEDIATE VALUE THEOREM

If f(x) is continuous in a closed interval [a, b] and f(a), f(b) have opposite signs,
i.e., f(a) · f(b) < 0, then the equation f(x) = 0 has at least one root between
x = a and x = b.



Since f(x) is continuous between a and b, while x changes from a to b, f(x) must
pass through all the values from f(a) to f(b). As f(a) and f(b) are opposite in
sign, it follows that for at least one value of x, say 'α' with a ≤ α ≤ b, f(x)
must be zero, i.e., f(α) = 0. Then 'α' is the desired root (see Figure 4.1).

Figure 4.1 Intermediate Value Theorem

Notes: 1. In an equation with real coefficients, imaginary roots occur in conjugate
pairs, i.e., if α + iβ is a root of the equation f(x) = 0, then α – iβ must
also be its root.
2. If a + √b is an irrational root of an equation f(x) = 0 with rational
coefficients, then a – √b must also be a root of the equation f(x) = 0.
3. Every equation of odd degree has at least one real root.

4.4 METHODS OF FINDING SOLUTIONS OF ALGEBRAIC AND TRANSCENDENTAL EQUATIONS

There are two kinds of methods to find the solution:
• Direct methods
• Iterative methods
4.4.1 Direct Methods
These methods find the roots of the equation f(x) = 0 in a finite number of steps.
For example, the roots of the quadratic equation ax² + bx + c = 0, where a ≠ 0,
are given by,

x = [−b ± √(b² − 4ac)] / 2a
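As a quick illustration of a direct method, the quadratic formula above can be evaluated in a fixed number of steps. The following is a minimal sketch (the function name is our own, and the complex square root is used so that the case b² − 4ac < 0 is also covered):

```python
import cmath

def quadratic_roots(a, b, c):
    # Direct method: both roots of ax^2 + bx + c = 0 in a fixed number of steps.
    disc = cmath.sqrt(b * b - 4 * a * c)  # complex sqrt also handles disc < 0
    return (-b + disc) / (2 * a), (-b - disc) / (2 * a)

r1, r2 = quadratic_roots(1, -3, 2)  # x^2 - 3x + 2 = (x - 1)(x - 2)
```

No iteration or initial guess is involved; this is what distinguishes a direct method from the iterative methods described next.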
4.4.2 Iterative Method
These methods are based on the idea of successive approximations. They start
with an initial approximation to the root and obtain a sequence of
approximations by repeating a sequence of steps till a desired accuracy is achieved.
Iterative methods generally provide one root at a time. They are very
cumbersome and time-consuming for solving equations manually, but they are best
suited for use on computers, as these methods can be concisely expressed as
computational algorithms. Also, the round-off errors are negligible in iterative
methods as compared to direct methods.



4.4.3 Initial Guess
The main step in an iterative method is to choose the right initial guess or approximation.
This can be done either by the use of the Intermediate Value Theorem or by plotting
the function graphically.
4.4.4 Rate of Convergence
The speed of convergence to the root in any method is represented by its rate of
convergence. Let xi be the value of the root at the ith step and α be the exact root.
Then the error at the ith step is ei = α – xi, which in practice is estimated by
xi+1 – xi. If ei+1/ei is almost constant, the convergence is said to be linear, or
of order 1.
If ei+1/ei^p is nearly constant, the convergence is said to be of order p.
In other words, an iterative method is said to be pth-order convergent if p is
the largest number such that,

lim (i → ∞) |ei+1| / |ei|^p ≤ k

where k is a finite number.
Physically, pth-order convergence means that in each iteration the number
of significant digits in each approximation increases roughly p times.

CHECK YOUR PROGRESS


1. Name the two kinds of methods for finding solutions of algebraic and
transcendental equations.
2. What is initial guess?
3. What is rate of convergence?

4.5 BISECTION METHOD

This method is based on the repeated application of the Intermediate Value Theorem.
Let f(x) be a continuous function between a and b, such that f(a) . f(b) < 0.
For definiteness, let f(b) > 0 and f(a) < 0. Then the first approximation to the root
α is given by,

x0 = (a + b)/2, i.e., the average of the endpoints of the interval.
Now, if f(x0) = 0, x0 is the root of f(x) = 0. Otherwise the root will lie
between x0 and a or between x0 and b depending upon the sign of f(x0).
Continue the same process of bisecting the interval until the root is found to
desired accuracy (see Figure 4.2).



Figure 4.2 Bisection Method


In general, if xi−1 and xi are the (i – 1)th and ith approximations to the root
such that f(xi−1) · f(xi) < 0, then the next approximation xi+1 is found as,

xi+1 = (xi−1 + xi)/2

Now, xi+1 is called the root of the equation f(x) = 0 if f(xi+1) = 0 or
|xi+1 – xi| ≤ ε, where ε is the desired accuracy. Clearly, ε is a very small
positive number.
Example 4.1: Find the root of the equation x³ – 3x – 7 = 0, using the Bisection
method, correct up to 3 decimal places.
Solution: Let f(x) = x³ – 3x – 7. For the initial guess, tabulate the sign of f(x):

Table 4.1 Use of Bisection Method to Calculate Roots

x: 0 1 2 3
f(x): –7 –9 –5 11
Sign: –ve –ve –ve +ve

Since f(2) is –ve and f(3) is +ve, a root lies between 2 and 3 by Intermediate
Value Theorem.
So, a = 2, b = 3, and the first approximation to the root is,
x0 = (a + b)/2 = (2 + 3)/2 = 2.5
Now f(2.5) = (2.5)³ – 3(2.5) – 7 = 1.125, which is +ve. So, 2.5 is not the
root of f(x) = 0, and the root now lies between 2 and 2.5, as f(2) · f(2.5) < 0. The
second approximation is given as,
x1 = (2 + 2.5)/2 = 2.25



and f(2.25) = –2.3594 < 0, i.e., –ve.
So, root lies between 2.25 and 2.5.
The third approximation is given by,
x2 = (2.25 + 2.5)/2 = 2.375
and f(x2) = –0.7285.
Thus, the root lies between 2.375 and 2.5.
The next approximations are given in Table 4.2.
Table 4.2 Table Showing Approximations

S.No.  a        b        x = (a + b)/2   f(a)   f(b)   f(x)   New interval
3.     2.375    2.5      2.4375          –ve    +ve    +ve    (2.375, 2.4375)
4.     2.375    2.4375   2.40625         –ve    +ve    –ve    (2.40625, 2.4375)
5.     2.40625  2.4375   2.4219          –ve    +ve    –ve    (2.4219, 2.4375)
6.     2.4219   2.4375   2.4297          –ve    +ve    +ve    (2.4219, 2.4297)
7.     2.4219   2.4297   2.4258          –ve    +ve    –ve    (2.4258, 2.4297)
8.     2.4258   2.4297   2.4278          –ve    +ve    +ve    (2.4258, 2.4278)
9.     2.4258   2.4278   2.4268          –ve    +ve    +ve    (2.4258, 2.4268)
10.    2.4258   2.4268   2.4263          –ve    +ve    +ve    (2.4258, 2.4263)
11.    2.4258   2.4263   2.4261          –ve    +ve    +ve    (2.4258, 2.4261)

Clearly |x11 – x10| = |2.4261 – 2.4263| = 0.0002, so the result is correct
up to 3 decimal places. Thus, the root of f(x) = x³ – 3x – 7 = 0 is,
x = 2.426

4.5.1 Minimum Number of Iterations Required in Bisection Method to Achieve the Desired Accuracy
Let f(x) = 0 be the equation and let the root α of f(x) = 0 lie in the interval
(a, b). Thus, f(a) · f(b) < 0 as α ∈ [a, b].



Let us rename the interval (a, b) as (a0, b0) and make it the initial guess.
Then, the next approximation x1 to the root α is given as,
x1 = (a0 + b0)/2
Now, if f(x1) = 0, then x1 is the root of f(x) = 0; otherwise, either
f(x1) · f(a0) < 0 or f(x1) · f(b0) < 0. Then the root lies either in (a0, x1) or in
(x1, b0). You can rename the interval in which α lies as (a1, b1).
i.e., b1 – a1 = (b0 – a0)/2    ...(4.1)

and x2 = (a1 + b1)/2

where x2 is the next approximation to the root α.
Now if x2 is the desired root of f(x) = 0, then f(x2) = 0. Otherwise, either
f(x2) · f(a1) < 0 or f(b1) · f(x2) < 0, giving the interval (a1, x2) or (x2, b1) for the
root α. After renaming the interval as (a2, b2), you have,
b2 – a2 = (b1 – a1)/2    ...(4.2)
By substituting the value of (b1 – a1) from Equation (4.1) into Equation (4.2),
you have,
b2 – a2 = (b0 – a0)/2²
Proceeding in the same way, you find,
xn+1 = (an + bn)/2
which gives the (n + 1)th approximation to the root, the root lying in (an, bn),
where bn – an = (b0 – a0)/2ⁿ    ...(4.3)
Here, (bn – an) gives the length of the interval in which the root lies after the
(n + 1)th approximation. The minimum length of the interval gives the maximum
accuracy in the root. If the desired accuracy is ε, then,
|bn – an| ≤ ε

⇒ (b0 – a0)/2ⁿ ≤ ε



By taking logarithms on both sides, you have,

n ≥ [log₁₀(b0 – a0) – log₁₀ ε] / log₁₀ 2

where n gives the minimum number of iterations required to achieve the desired
accuracy ε.
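The bound just derived is easy to evaluate numerically. The sketch below (the helper name is our own) computes the smallest integer n with (b0 − a0)/2ⁿ ≤ ε:

```python
import math

def min_bisection_iterations(a0, b0, eps):
    # Smallest integer n with (b0 - a0) / 2**n <= eps, i.e.
    # n >= (log10(b0 - a0) - log10(eps)) / log10(2).
    return math.ceil((math.log10(b0 - a0) - math.log10(eps)) / math.log10(2))

n = min_bisection_iterations(2.0, 3.0, 0.5e-3)  # interval of length 1
```

For an interval of length 1 and ε = ½ × 10⁻³ this gives n = 11, so at most eleven bisections are needed to reach that accuracy.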
4.5.2 Convergence of Bisection Method
In the Bisection method, the original interval is halved in each iteration, the
root being bracketed in the new interval:

xi+1 = (xi−1 + xi)/2

Now, if you take the midpoints of the successive intervals to be the
approximations of the root, then one half of the current interval is an upper bound
on the error. Thus, if ei is the error in the ith iteration, the error ei+1 in the
(i + 1)th iteration is given by,

ei+1 = ei/2  or  ei+1/ei = 1/2

Comparing this with ei+1/ei^p ≤ k, you have p = 1. Hence, the Bisection method
is linearly convergent.
Example 4.2: Using the Bisection method, compute the smallest positive root of the
equation xeˣ – 2 = 0, correct up to three decimal places.
Solution: Let f(x) = xeˣ – 2.
f(0) = –2, i.e., negative, and f(1) = 0.7183, i.e., positive.
Thus, the root lies between 0 and 1.
You know that the Bisection method is,

xn+1 = (xn−1 + xn)/2

Here x0 = 0 and x1 = 1,
so, x2 = (x0 + x1)/2 = (0 + 1)/2 = 0.5
and f(x2) = –1.1756, i.e., negative.
Thus, the root lies between (0.5, 1).
Also, x3 = (0.5 + 1)/2 = 0.75
and f(x3) = –0.4122, i.e., negative.
So the root lies between (0.75, 1).
The next iteration is,
x4 = (0.75 + 1)/2 = 0.875
and f(x4) = 0.875 · e^0.875 – 2 = 0.0990, i.e., positive.
So, the next interval for the root is (0.75, 0.875) and hence,
x5 = (0.75 + 0.875)/2 = 0.8125
and f(0.8125) = –0.1690, i.e., negative.
Next interval is (0.8125, 0.875).
So, the next iteration is,
x6 = (0.8125 + 0.875)/2 = 0.8438
and f(x6) = –0.0381, i.e., negative.
Next interval is (0.8438, 0.875).
So, x7 = (0.8438 + 0.875)/2 = 0.8594
and f(x7) = 0.0297, i.e., positive.
Next interval for the root is (0.8438, 0.8594),
and x8 = (0.8438 + 0.8594)/2 = 0.8516
∴ f(x8) = –0.0044, i.e., negative.
Next interval is (0.8516, 0.8594),
and x9 = (0.8516 + 0.8594)/2 = 0.8555
∴ f(x9) = 0.0126, i.e., positive.
So, the next interval is (0.8516, 0.8555),
and x10 = (0.8516 + 0.8555)/2 = 0.8536
∴ f(x10) = 0.0042, i.e., positive.
Next interval is (0.8516, 0.8536),
and x11 = (0.8516 + 0.8536)/2 = 0.8526
∴ f(x11) = –0.000024, i.e., negative,
giving the next interval for the root as (0.8526, 0.8536),
and x12 = (0.8526 + 0.8536)/2 = 0.8531
∴ f(x12) = 0.00215, i.e., positive.
Next interval is (0.8526, 0.8531),
and x13 = (0.8526 + 0.8531)/2 = 0.8529
∴ f(x13) = 0.0012, i.e., positive.
Since the two iterations x12 and x13 are approximately the same, with
|x12 – x13| = 0.0002, the desired root of the given equation is 0.8529, correct
up to 3 decimal places.
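The hand computation of Example 4.2 can be sketched in code as follows. This is a minimal illustration with our own function name, not a program from the text:

```python
import math

def bisect(f, a, b, eps=1e-4):
    # Repeatedly halve the bracketing interval [a, b] where f(a)*f(b) < 0.
    if f(a) * f(b) >= 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    while (b - a) > eps:
        mid = (a + b) / 2
        if f(a) * f(mid) < 0:
            b = mid  # root lies in (a, mid)
        else:
            a = mid  # root lies in (mid, b)
    return (a + b) / 2

root = bisect(lambda x: x * math.exp(x) - 2, 0.0, 1.0)
```

Carried to convergence, the returned value is close to 0.8526, matching the example above.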

4.6 REGULA–FALSI METHOD OR METHOD OF FALSE POSITION

The Regula–Falsi method, or method of false position, is the oldest method of
finding roots of the equation f(x) = 0.
Let f(x) = 0 be an equation and let a and b be two values of x such that
f(a) · f(b) < 0. Then, according to the Intermediate Value Theorem, there exists at
least one root x = α of f(x) = 0 for which f(α) = 0. This α is the point on the
x-axis where the curve y = f(x) crosses the x-axis while moving from x = a to
x = b, as f(a) · f(b) < 0 (see Figure 4.3).

Figure 4.3 Regula–Falsi Method

The equation of the chord joining the two points A(a, f(a)) and B(b, f(b)) is
given as,



y – f(a) = [(f(b) – f(a)) / (b – a)] · (x – a)    ...(4.4)

In the interval (a, b) the curve can be approximated by this straight line, so the
intersection of the line in Equation (4.4) with the x-axis can be taken as the new
approximation to the root. So, the abscissa of the point where the chord cuts the
x-axis (y = 0) is given by,

x0 = a – [(b – a) / (f(b) – f(a))] · f(a)

or x0 = [a f(b) – b f(a)] / [f(b) – f(a)]
which is the new approximation to the root. Now, if f(a) and f(x0) are of opposite
signs, the root will lie in (a, x0); otherwise the root will lie in (x0, b). If
f(a) · f(x0) < 0, you will replace b by x0 and obtain the new approximation x1; if
f(x0) · f(b) < 0, then you will replace a by x0 and obtain the next approximation
x1.
In general, if the root lies between (xi−1, xi), i.e., f(xi−1) · f(xi) < 0, then the
next approximation xi+1 to the root α is obtained as,

xi+1 = [xi−1 f(xi) – xi f(xi−1)] / [f(xi) – f(xi−1)]

Continue the same process till f(xi+1) = 0 or |xi+1 – xi| < ε, where ε is
a very small positive number known as the desired accuracy. Usually the value of ε is
taken as,
ε = (1/2) × 10⁻³
Example 4.3: Using the Regula–Falsi method, compute the smallest positive root of
the equation xeˣ – 2 = 0, correct up to three decimal places.
Solution: f(x) = xeˣ – 2
For initial guess

x: 0 1
Sign of f(x): –ve +ve

Since f(0) < 0 and f(1) > 0, the root for f(x) = xex – 2 lies between (0, 1).
i.e., f(0) = –2, f(1) = 0.7183
Since x0 = 0 and x1 = 1, the next approximation x2 is obtained as,



x2 = [x0 f(x1) – x1 f(x0)] / [f(x1) – f(x0)] = [0 – 1 × (–2)] / [0.7183 – (–2)]
   = 0.7358
and f(x2 ) = –0.4644, i.e., negative.
Thus, x0 will be replaced by x2 and the root will now lie between (0.7358, 1).
So,
x3 = [x2 f(x1) – x1 f(x2)] / [f(x1) – f(x2)]
   = [0.7358 × 0.7183 – 1 × (–0.4644)] / [0.7183 – (–0.4644)]
   = 0.8395
and f(x3) = –0.0563, i.e., negative.
Now the next interval for x4 is (x3, x1).
So, x0 = x3 = 0.8395 and x1 = 1
f(x0) = f(x3) = –0.0563
f(x1) = 0.7183
x4 = [x0 f(x1) – x1 f(x0)] / [f(x1) – f(x0)]
   = [0.8395 × 0.7183 – 1 × (–0.0563)] / (0.7183 + 0.0563)
   = 0.8512 and f(x4) = –0.0062
Next interval of approximation is (0.8512, 1) and x5 can be calculated as,
So, x5 = [0.8512 × 0.7183 – 1 × (–0.0062)] / (0.7183 + 0.0062)
   = 0.8525 and f(x5) = –0.0005
Next interval of approximation is (0.8525, 1) and x6 can be calculated as,
x6 = [0.8525 × 0.7183 – 1 × (–0.0005)] / (0.7183 + 0.0005)
   = 0.8526 and f(x6) = –0.0000239 ≈ 0.0000
Clearly, the value of f(x) at x = x6 is approximately equal to zero. That is, x6
satisfies f(x) = 0 and also | x6 – x5 | = 0.0001.



It is the desired accuracy, as the result should be correct up to three decimal
places.
Thus, x = 0.8526 is the root of the equation xex – 2 = 0.
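The same steps can be sketched programmatically. The following is an illustrative implementation under our own naming, not the text's own program:

```python
import math

def regula_falsi(f, a, b, eps=0.5e-3, max_iter=100):
    # Intersect the chord through (a, f(a)) and (b, f(b)) with the x-axis,
    # then keep the sub-interval whose endpoints still bracket the root.
    fa, fb = f(a), f(b)
    x_prev = a
    for _ in range(max_iter):
        x = (a * fb - b * fa) / (fb - fa)
        fx = f(x)
        if fx == 0 or abs(x - x_prev) < eps:
            return x
        if fa * fx < 0:
            b, fb = x, fx
        else:
            a, fa = x, fx
        x_prev = x
    return x_prev

root = regula_falsi(lambda x: x * math.exp(x) - 2, 0.0, 1.0)
```

The iterates reproduce the sequence 0.7358, 0.8395, 0.8512, ... computed by hand above.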
4.6.1 Convergence of Regula–Falsi Method
The general formula for Regula–Falsi method is,

xi −1. f ( xi ) − xi . f ( xi −1 )
xi + 1 =
f ( xi ) − f ( xi −1 )

This can be rewritten as,

( xi − xi −1 )
xi + 1 = xi − . f ( xi ) ...(4.5)
f ( xi ) − f ( xi −1 )

Now, let α be the root of f(x) = 0 and let ei−1, ei and ei+1 be the errors
associated with the (i – 1)th, ith and (i + 1)th iterations.
Then, xi+1 = α + ei+1, xi = α + ei and xi−1 = α + ei−1.
By Equation (4.5) you have,

α + ei+1 = (α + ei) – [(α + ei) – (α + ei−1)] / [f(α + ei) – f(α + ei−1)] × f(α + ei)

or ei+1 = ei – [(ei – ei−1) / (f(α + ei) – f(α + ei−1))] × f(α + ei)    ...(4.6)

By expanding f(α + ei) and f(α + ei−1) using Taylor's series up to the third
term, you get,

ei+1 = ei – (ei – ei−1) f(α + ei) / {[f(α) + ei f′(α) + (ei²/2!) f″(α) + …]
        – [f(α) + ei−1 f′(α) + (ei−1²/2!) f″(α) + …]}

After simplifying the denominator, you have,

ei+1 = ei – (ei – ei−1) · [f(α) + ei f′(α) + (ei²/2!) f″(α) + …]
        / [(ei – ei−1) f′(α) + ((ei² – ei−1²)/2) f″(α)]



or ei+1 = ei – [f(α) + ei f′(α) + (ei²/2!) f″(α)] / [f′(α) + ((ei + ei−1)/2) f″(α)]    ...(4.7)

Since α is the root of f(x) = 0, f(α) = 0, and from Equation (4.7) you have,

ei+1 = ei – [ei f′(α) + (ei²/2!) f″(α)] / [f′(α) + ((ei + ei−1)/2) f″(α)]

or ei+1 = ei – [ei + (ei²/2) · f″(α)/f′(α)] / [1 + ((ei + ei−1)/2) · f″(α)/f′(α)]

or ei+1 = ei – [ei + (ei²/2) · f″(α)/f′(α)] · [1 + ((ei + ei−1)/2) · f″(α)/f′(α)]⁻¹
Using the Binomial expansion and retaining the first two terms, you get,

ei+1 = ei – [ei + (ei²/2) · f″(α)/f′(α)] · [1 – ((ei + ei−1)/2) · f″(α)/f′(α)]

or ei+1 = ei – [ei – (ei(ei + ei−1)/2) · f″(α)/f′(α) + (ei²/2) · f″(α)/f′(α)
        – (ei²(ei + ei−1)/4) · (f″(α)/f′(α))²]

= ei ei−1 · f″(α)/(2 f′(α)) + (ei²(ei + ei−1)/4) · (f″(α)/f′(α))²

If ei and ei−1 are very small, then ignoring the terms involving
ei², ei³ and ei² ei−1, you get,

ei+1 ≈ ei ei−1 · f″(α)/(2 f′(α))

or ei+1 = ei ei−1 · M    ...(4.8)

where M = f″(α)/(2 f′(α)) is a constant.
Now, in order to find the order of convergence, you need to find a number p
such that,

|ei+1| / |ei|^p ≤ k

or ei+1 = ei^p · k    ...(4.9)

Similarly, ei = ei−1^p · k,

or ei−1 = (ei / k)^(1/p)

Substituting the values of ei+1 and ei−1 in Equation (4.8), you have,

k · ei^p = ei × (ei / k)^(1/p) × M

or ei^p = M × k^(−(1 + 1/p)) × ei^(1 + 1/p)    ...(4.10)

Comparing the powers of ei on both sides of Equation (4.10), you have,

p = 1 + 1/p or p² – p – 1 = 0

which on solving gives,

p = (1 ± √5)/2

Taking only the positive sign, you have,
p = 1.618
which is the rate of convergence of the Regula–Falsi method.

4.7 SECANT METHOD

In the Secant method, you do not require the condition f(a) · f(b) < 0. Thus, this
method is an improvement over the Regula–Falsi method. Apart from this, the Secant
method is just like the Regula–Falsi method.
In this method also, the graph of y = f(x) is approximated by a secant
line, but at each iteration the two most recent approximations are used to find
the next approximation. Thus, in this method, it is not required that the
interval used for finding the next approximation contain the root.
The formula for the Secant method is just the same as in the case of the Regula–Falsi
method:

xi+1 = [xi−1 f(xi) – xi f(xi−1)] / [f(xi) – f(xi−1)]

where (xi−1, xi) is the most recent pair of approximations. Thus, the condition
f(xi−1) · f(xi) < 0 is not required in this method.
The convergence of the Secant method is of order 1.618, as in the case of the
Regula–Falsi method.
Example 4.4: Find a real root of the equation x log₁₀x = 1.2 by the Secant method,
correct up to three decimal places.
Solution: f(x) = x log₁₀x – 1.2
Let the initial approximations be x0 = 1 and x1 = 2; then f(x0) = –1.2 and
f(x1) = –0.59794.
So, by the Secant method, you have,

x2 = x1 – [(x1 – x0) / (f(x1) – f(x0))] · f(x1)

   = 2 – [1 / (–0.59794 + 1.2)] × (–0.59794) = 2.9932

and f(x2) = 0.2251.
The two most recent approximations are (2, 2.9932).

So, x3 = x2 – [(x2 – x1) / (f(x2) – f(x1))] · f(x2)

   = 2.9932 – [0.9932 / (0.2251 + 0.59794)] × 0.2251

   = 2.7215
and f(x3) = –0.0167.
The two most recent approximations are (2.9932, 2.7215).

So, x4 = x3 – [(x3 – x2) / (f(x3) – f(x2))] · f(x3) = 2.7402
and f(x4) = –0.0004.
The two most recent approximations are (2.7215, 2.7402).

So, x5 = x4 – [(x4 – x3) / (f(x4) – f(x3))] · f(x4) = 2.7406
and f(x5) ≈ 0.0000.

Since |x5 – x4| = 0.0004, the value 2.7406 is correct up to three decimal
places and hence the desired root of x log₁₀x – 1.2 = 0 is 2.7406.
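A compact sketch of the Secant iteration used above (the helper name is our own; note that no bracketing check is made):

```python
import math

def secant(f, x0, x1, eps=0.5e-3, max_iter=50):
    # Always work with the two most recent approximations.
    f0, f1 = f(x0), f(x1)
    for _ in range(max_iter):
        x2 = x1 - f1 * (x1 - x0) / (f1 - f0)
        if abs(x2 - x1) < eps:
            return x2
        x0, f0 = x1, f1
        x1, f1 = x2, f(x2)
    return x1

root = secant(lambda x: x * math.log10(x) - 1.2, 1.0, 2.0)
```

Starting from x0 = 1 and x1 = 2 the returned value is close to 2.7406, the root found above.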

4.8 FIXED POINT ITERATION METHOD

Let f(x) = 0 be the equation. To find a root of this equation by successive
approximations, rewrite f(x) = 0 in the form,
x = φ(x)    ...(4.11)
The roots of f(x) = 0 are the same as the abscissae of the points of intersection
of the straight line y = x and the curve y = φ(x).
Let x = x0 be the initial approximation to the root α (see Figure 4.4). Then,
the first approximation x1 is given by,
x1 = φ(x0)
Figure 4.4 Iteration Method



The second approximation x2 is obtained by taking x1 as the initial
approximation:
x2 = φ(x1)
and so on; the ith approximation is given as,

xi = φ(xi−1)

Example 4.5: Solve the following using the Iteration method, finding the root
correct up to three decimal places:

(i) x – sin x = 0.5 (ii) sin x = 2x – 1
Solution: (i) f(x) = x – sin x – 0.5 = 0.
f(1) = –0.3415, i.e., negative,
and f(2) = 0.5907, i.e., positive.
Thus, the root lies between (1, 2).
Let x0 = 1.4; then the given equation may be written as,
x = sin x + 0.5 ≡ φ(x)
Thus, φ′(x) = cos x.
Now |cos x| < 1 on (1, 2), so you can apply the Iteration method. You
have,
xn+1 = φ(xn), n ≥ 0.
Thus, xn+1 = sin xn + 0.5
Putting n = 0, the first iteration gives,
x1 = sin (1.4) + 0.5 = 1.4854
x2 = φ(x1) = 1.4964
x3 = φ(x2) = 1.4972
x4 = φ(x3) = 1.4973
x5 = φ(x4) = 1.4973
Since x4 and x5 are the same, the root of x – sin x – 0.5 = 0 is 1.497, correct
up to three decimal places.
(ii) sin x = 2x – 1. Let f(x) = sin x – 2x + 1; then f(0) = 1, i.e., positive,
and f(1) = sin 1 – 1 = –0.1585, i.e., negative.
Thus, the root lies between (0, 1).
Let x0 = 0.5; then the equation can be rewritten as,
x = (sin x + 1)/2 ≡ φ(x)
Thus, φ′(x) = (1/2) cos x
and |φ′(x)| = (1/2) |cos x| < 1 for (0, 1).
The Iteration method xn+1 = φ(xn) can be applied by taking x0 = 0.5.
So, x1 = φ(x0) = (sin 0.5 + 1)/2 = 0.7397
x2 = φ(x1) = 0.8370
x3 = φ(x2) = 0.8713
x4 = φ(x3) = 0.8826
x5 = φ(x4) = 0.8862
x6 = φ(x5) = 0.8873
x7 = φ(x6) = 0.8877
Since x6 and x7 are almost the same, the root of sin x = 2x – 1 correct up to
three decimal places is 0.888.
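The repeated substitution used in Example 4.5 can be sketched as follows (an illustrative helper of our own, assuming |φ′(x)| < 1 near the root):

```python
import math

def fixed_point(phi, x0, eps=0.5e-3, max_iter=100):
    # Iterate x_{n+1} = phi(x_n) until successive values agree within eps.
    x = x0
    for _ in range(max_iter):
        x_new = phi(x)
        if abs(x_new - x) < eps:
            return x_new
        x = x_new
    return x

root = fixed_point(lambda x: math.sin(x) + 0.5, 1.4)  # part (i) above
```

For part (i) the iterates settle near 1.497, in agreement with the hand computation.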
Example 4.6: Apply the Iteration method to obtain roots of the following equations:
(i) x³ + x – 5 = 0 (ii) eˣ = 4x
Solution: (i) f(x) = x³ + x – 5
f(0) = –5, negative; f(1) = –3, negative;
and f(2) = 5, positive.
Thus, the root lies between (1, 2). Let x0 = 1.5.
The equation can be written in the forms,

(i) x = –x³ + 5 ≡ φ1(x); (ii) x = ∛(5 – x) ≡ φ2(x); and (iii) x = 5/(x² + 1) ≡ φ3(x)

Clearly, |φ1′(x)| and |φ3′(x)| are not less than 1 on (1, 2),

whereas |φ2′(x)| = (1/3) · 1/(5 – x)^(2/3) < 1 for (1, 2)

So, you can apply the Iteration method. You have,

xn+1 = φ(xn)

with x0 = 1.5 and φ(xn) = ∛(5 – xn).

The first iteration is,

x1 = φ(x0) = ∛(5 – 1.5) = 1.5183

x2 = φ(x1) = 1.5156



x3 = φ(x2) = 1.5160

x4 = φ(x3) = 1.5160
Since x3 and x4 are the same, the root of the equation x³ + x – 5 = 0 is
1.5160.
(ii) f(x) = eˣ – 4x, f(0) = 1, i.e., positive,
f(1) = e – 4 = –1.2817, i.e., negative.
Thus, the root lies between (0, 1).
Let x0 = 0.5.
The equation can be rewritten as x = eˣ/4;

then φ(x) = eˣ/4 and φ′(x) = eˣ/4, and hence |φ′(x)| < 1 for (0, 1).
Thus, you apply the Iteration method to get,

xn+1 = φ(xn) with φ(x) = eˣ/4

and x0 = 0.5;

then, x1 = φ(x0) = e^0.5/4 = 0.4122

x2 = φ(x1) = e^0.4122/4 = 0.3775

x3 = φ(x2) = e^0.3775/4 = 0.3647

x4 = φ(x3) = e^0.3647/4 = 0.3600
x 5 = φ(x4) = 0.3583
x 6 = φ(x5) = 0.3577
x 7 = φ(x6) = 0.3575
x 8 = φ(x7) = 0.3574
Since x7 and x8 are the same, the root of the equation eˣ = 4x is 0.3574,
correct up to three decimal places.



4.8.1 Condition for Convergence of Iteration Method
It is not true that every sequence x0, x1, x2, … will converge to the root; for this,
you have to choose the initial x0 suitably, so that the successive approximations x1,
x2, …, xi converge to the root α in a finite number of steps. The following theorem
helps in finding the right initial guess.
Theorem 4.1: If α is the root of the equation x = φ(x), which is equivalent to
f(x) = 0, and I is any interval containing the root x = α, and if |φ′(x)| < 1 for all
x ∈ I, then the sequence x1, x2, …, xi will converge to the root α, provided the
initial guess x0 ∈ I.
Thus, the iterative method will converge to the root if |φ′(x)| < 1.
Notes: 1. The smaller the value of φ′(x), the faster will be the convergence.
2. Iterative method is very useful in finding out the real roots of an equation
given in the form of an infinite series.
4.8.2 Convergence of Iteration Method
The Iteration method for f(x) = 0 is,
x = φ(x)
in general, xn + 1 = φ(xn) ...(4.12)
Let α be the exact root of f(x) = 0. Also let ei and ei + 1 be the errors
associated with ith and (i + 1)th iterations. Then, by Equation (4.12), you have
(α + ei + 1) = φ(α + ei) ...(4.13)
Applying Taylor's theorem in Equation (4.13), you have,

α + ei+1 = φ(α) + ei φ′(α) + (ei²/2!) φ″(α) + …

Since ei is small, neglecting the second and higher order terms of ei, you
have,
α + ei+1 = φ(α) + ei φ′(α)
or ei+1 = ei φ′(α) + (φ(α) – α)    ...(4.14)
Now, since α is the root of f(x) = 0, therefore α = φ(α), and Equation
(4.14) reduces to,
ei+1 = ei φ′(α)

or ei+1/ei = φ′(α)

For convergence of the Iteration method, |φ′(x)| < 1. Thus, the Iteration method is
linearly convergent provided |φ′(x)| < 1.
4.9 NEWTON–RAPHSON METHOD

Let y = f(x) or f(x) = 0 be the equation and let x = x0 be the initial approximation
to the root.
Let x = x1 be the exact root; then f(x1) = 0 and x1 = x0 + h,
where h is a small number known as the step size.
⇒ f(x0 + h) = 0 (as f(x1) = 0)
Expanding by Taylor’s theorem, you have,
f(x0) + h f′(x0) + (h²/2!) f″(x0) + (h³/3!) f‴(x0) + … = 0
Since h is a very small number, so, you can neglect the terms involving h2,
3
h , ... .
So, after neglecting the higher order terms, you have,
f(x0) + h f′(x0) ≈ 0

or h = – f(x0)/f′(x0)

Thus, x1 = x0 + h = x0 – f(x0)/f′(x0)

so x1 = x0 – f(x0)/f′(x0) gives a better approximation to the root of
f(x) = 0 than x0.
Similarly, if x2 denotes a better approximation than x1, you have,
x2 = x1 – f(x1)/f′(x1)
Continuing in this manner, you have,
xn+1 = xn – f(xn)/f′(xn)
This is known as the Newton–Raphson method.
4.9.1 Geometrical Interpretation
Let x0 be the initial approximation near to the root α of f(x) = 0. Then, the equation
of tangent drawn at point A(x0, f(x0)) on the curve y = f(x) is given as,
y – f(x0) = (x – x0) · f′(x0)    ...(4.15)
This cuts the x-axis at the point x = x1, where y = 0.



So, by Equation (4.15), you have,
–f(x0) = (x1 – x0) f′(x0)

or x1 = x0 – f(x0)/f′(x0)

which is the first approximation to the root α (see Figure 4.5).
Figure 4.5 Geometrical Interpretation

Now, B is the point on the curve y = f(x) corresponding to x1. Then, the
tangent drawn at B(x1, f(x1)) will cut the x-axis at some point x = x2, which is
nearer to the root α. Thus, x2 will give the second approximation to the root and is
given as,

x2 = x1 – f(x1)/f′(x1)

Proceeding in this manner, the (n + 1)th approximation to the root is given as:

xn+1 = xn – f(xn)/f′(xn)
This method is based on replacing the part of the curve between point
A and x-axis by means of the tangent to the curve at A.
4.9.2 Convergence of Newton–Raphson Method
Let f(x) = 0 be the equation and α be the exact root, i.e., f(α) = 0. Let xn and
xn+1 be two successive approximations to the root α.
Then xn = α + εn and xn+1 = α + εn+1,
where εn and εn+1 are the errors associated with the approximations xn and xn+1.



Now the Newton–Raphson formula is,

xn+1 = xn – f(xn)/f′(xn)    ...(4.16)

Substituting the values of xn and xn+1 in Equation (4.16), you have,

α + εn+1 = α + εn – f(α + εn)/f′(α + εn)

or εn+1 = εn – f(α + εn)/f′(α + εn)    ...(4.17)

Expanding f(α + εn) and f′(α + εn) in Equation (4.17) using Taylor's
series, you have,

εn+1 = εn – [f(α) + εn f′(α) + (εn²/2!) f″(α) + …]
        / [f′(α) + εn f″(α) + (εn²/2!) f‴(α) + …]    ...(4.18)

Since α is the exact root of f(x) = 0, f(α) = 0. Then from Equation
(4.18) you have,

εn+1 = εn – [εn f′(α) + (εn²/2!) f″(α) + …]
        / [f′(α) + εn f″(α) + (εn²/2!) f‴(α) + …]

or εn+1 = [(εn²/2) f″(α) + (εn³/3) f‴(α) + …]
        / [f′(α) + εn f″(α) + (εn²/2!) f‴(α) + …]

Since εn is very small, neglecting the third and higher powers of εn,
you have,

εn+1 = εn² f″(α) / (2[f′(α) + εn f″(α)])

≈ (εn²/2) · [f″(α)/f′(α)]

That is, the error at the (n + 1)th iteration is proportional to the square of the
error at the nth iteration.



Hence, the convergence of the Newton–Raphson method is quadratic, i.e., of
second order.
Notes: 1. The Newton–Raphson method is useful in the case of large values of f′(x),
i.e., when the graph of f(x), while intersecting the x-axis, is nearly vertical.
2. The Newton–Raphson method converges to the root provided the initial guess
x0 is chosen sufficiently close to the root α. If it is not near, the
procedure may lead to an endless cycle.
Comparing xn+1 = xn – f(xn)/f′(xn)
with the relation xn+1 = φ(xn) of the Iteration method,
you have, φ(x) = x – f(x)/f′(x)    ...(4.19)
You know that the Iteration method converges if,
|φ′(x)| < 1
Differentiating Equation (4.19) with respect to x, you get,

φ′(x) = f(x) · f″(x) / [f′(x)]²

which gives the condition |f(x) · f″(x)| / [f′(x)]² < 1.
Thus, the Newton–Raphson method will converge if,

|f(x) · f″(x)| < [f′(x)]²

Example 4.7: Evaluate ⁶√125 by using the Newton–Raphson method.
Solution: Let x = ⁶√125;
then x⁶ = 125
or x⁶ – 125 = 0.
Let f(x) = x⁶ – 125.
Applying the Newton–Raphson method, you have,

xn+1 = xn – f(xn)/f′(xn)

with f′(x) = 6xn⁵,



so, xn+1 = xn – (xn⁶ – 125)/(6xn⁵)

= (5xn⁶ + 125)/(6xn⁵)
Since f(2) is negative and f(3) is positive, the root lies between (2, 3).

Let x0 = 2.5; then, x1 = [5(2.5)⁶ + 125] / [6(2.5)⁵] = 2.2967

x2 = [5(2.2967)⁶ + 125] / [6(2.2967)⁵] = 2.2399

x3 = [5(2.2399)⁶ + 125] / [6(2.2399)⁵] = 2.2361

x4 = [5(2.2361)⁶ + 125] / [6(2.2361)⁵] = 2.23606
≈ 2.2361 (correct up to four decimal places)
Since x3 and x4 are the same, the root of f(x) = x⁶ – 125 is 2.2361,
i.e., ⁶√125 = 2.2361.

Example 4.8: Obtain the iterative formulas for √a, ∛a and 1/√a using the
Newton–Raphson method.
Solution: (i) For √a:
Let x = √a; then x² – a = 0.
Also, let f(x) = x² – a.
Then, according to the Newton–Raphson method, you have,

xn+1 = xn – f(xn)/f′(xn)    ...(1)

with f(xn) = xn² – a and f′(xn) = 2xn.

Thus, by Equation (1), you have,



xn+1 = xn – (xn² – a)/(2xn)

or xn+1 = (xn² + a)/(2xn) = (1/2)(xn + a/xn)

Thus, the iterative formula for √a is,

xn+1 = (1/2)(xn + a/xn)

(ii) For ∛a: Let, x = ∛a
⇒ x³ – a = 0
Let, f(x) = x³ – a
Then, f(xn) = xn³ − a and f ′(xn) = 3xn². Substituting these values in Equation (1), you have,
xn+1 = xn − (xn³ − a)/(3xn²)
or xn+1 = (2xn³ + a)/(3xn²)
or xn+1 = (1/3)(2xn + a/xn²).
This is the required iterative formula for ∛a.
(iii) For 1/√a:
Let, x = 1/√a
or x² – 1/a = 0
Also let, f(x) = x² – 1/a
then, f(xn) = xn² – 1/a and f ′(xn) = 2xn
Substituting these values in Equation (1), you have,
xn+1 = xn − (xn² − 1/a)/(2xn)
or xn+1 = (xn² + 1/a)/(2xn)
or xn+1 = (1/2)(xn + 1/(a·xn)),
which is the required iterative formula for 1/√a using Newton–Raphson method.
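The three formulas of Example 4.8 can be applied directly; a sketch for a = 5 follows (the starting guesses and the fixed iteration count are illustrative choices):

```python
# The iterative formulas of Example 4.8, applied for a = 5.
def iterate(step, x, n=25):
    """Apply the one-step map repeatedly, starting from x."""
    for _ in range(n):
        x = step(x)
    return x

a = 5.0
sqrt_a   = iterate(lambda x: 0.5 * (x + a / x), 2.0)         # x_{n+1} = (x_n + a/x_n)/2
cbrt_a   = iterate(lambda x: (2 * x + a / x**2) / 3, 2.0)    # x_{n+1} = (2x_n + a/x_n^2)/3
inv_sqrt = iterate(lambda x: 0.5 * (x + 1 / (a * x)), 0.5)   # x_{n+1} = (x_n + 1/(a x_n))/2
print(round(sqrt_a, 4), round(cbrt_a, 4), round(inv_sqrt, 4))  # 2.2361 1.71 0.4472
```

Each map is the Newton–Raphson step for x² − a, x³ − a and x² − 1/a respectively, so convergence is second order near the root.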
Example 4.9: Use Newton–Raphson method to find the smallest positive root of
the equation tan x = x.
Solution: Given, tan x = x
or tan x – x = 0
Let, f(x) = tan x – x
∴ f ′(x) = sec²x – 1
and f(–1) = –0.5574, i.e., negative,
and f(1) = +0.5574, i.e., positive.
This root lies between (–1, 1).
Let, x0 = 1;
then the Newton–Raphson formula is,
xn+1 = xn − f(xn)/f ′(xn)
Then, x1 = 1 – [tan(1) − 1]/[sec²(1) − 1]
= 1 – 0.5574/0.8508
= 0.3449
Also, x2 = 0.3449 – [tan(0.3449) − 0.3449]/[sec²(0.3449) − 1]
= 0.3449 – 0.0127
= 0.3322
Next iterations are shown in Table 4.3.
Table 4.3 Calculation of Smallest +ve Root of Equation tan x = x

Iteration No.    xn        xn+1 = xn – (tan xn – xn)/(sec²xn – 1)
3. 0.3322 0.2248
4. 0.2248 0.1509
5. 0.1509 0.1009
6. 0.1009 0.0675
7. 0.0675 0.0452
8. 0.0452 0.0298
9. 0.0298 0.0100
10. 0.0100 0.0067
11. 0.0067 0.0042
12. 0.0042 0.0029
13. 0.0029 0.0019
14. 0.0019 0.0013
15. 0.0013 0.0009
Since the error between x15 and x14 is | 0.0009 – 0.0013 | = 0.0004, the
root is 0.0009, correct upto three decimal places; note that the iterates are, in fact, slowly approaching the trivial root x = 0 of tan x – x = 0.
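The slow drift in Table 4.3 is easy to reproduce. Because x = 0 is itself a root of tan x − x = 0 (and f, f ′ and f ″ all vanish there), Newton–Raphson started at x0 = 1 converges only linearly towards 0; a sketch (iteration count is an illustrative choice):

```python
import math

# Newton-Raphson on f(x) = tan x - x, started at x0 = 1. The iterates creep
# linearly towards the multiple root at x = 0, as in Table 4.3.
f  = lambda x: math.tan(x) - x
df = lambda x: 1.0 / math.cos(x) ** 2 - 1.0   # sec^2 x - 1

x = 1.0
for _ in range(30):
    x = x - f(x) / df(x)
print(x)  # a small positive number approaching the root x = 0
```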
4.9.3 Newton–Raphson Method for System of Non-Linear Equations
Let f1(x, y) = 0 and f2(x, y) = 0 be two non-linear equations.
Let (x0, y0) be the initial approximation to the solution of the system and let
(x1, y1) be the exact solution.
Then, x 1 = x0 + h and y1 = y0 + k
where h and k are very small numbers known as step size.
Then, you must have,
f1(x1, y1) = 0 and f2(x1, y1) = 0
Expanding using Taylor’s series, you have,
f1(x0 + h, y0 + k) = f1(x0, y0) + h·∂f1/∂x0 + k·∂f1/∂y0 + ...
and f2(x0 + h, y0 + k) = f2(x0, y0) + h·∂f2/∂x0 + k·∂f2/∂y0 + ...
Neglecting the second and higher order terms, as h and k are very small, you have,
h·∂f1/∂x0 + k·∂f1/∂y0 = –f1(x0, y0) ...(4.20)
and h·∂f2/∂x0 + k·∂f2/∂y0 = –f2(x0, y0) ...(4.21)
Solving Equations (4.20) and (4.21) for h and k, the next approximation of the solution is obtained as x1 = x0 + h and y1 = y0 + k.
Example 4.10: Solve x² – y² = 3 and x² + y² = 16 by Newton–Raphson method, given x0 = y0 = 2.54.
Solution: f1(x, y) = x² – y² – 3 and f2(x, y) = x² + y² – 16, given x0 = y0 = 2.54.
So, f1(x0, y0) = –3 and f2(x0, y0) = –3.0968
and ∂f1/∂x = 2x, ∂f1/∂y = –2y; also ∂f2/∂x = 2x and ∂f2/∂y = 2y.
Thus, ∂f1/∂x0 = 5.08; ∂f1/∂y0 = –5.08; ∂f2/∂x0 = 5.08 and ∂f2/∂y0 = 5.08
Newton–Raphson formula for a set of non-linear equations is,
h·∂f1/∂x0 + k·∂f1/∂y0 = –f1(x0, y0) ...(1)
and h·∂f2/∂x0 + k·∂f2/∂y0 = –f2(x0, y0) ...(2)
with x1 = x0 + h and y1 = y0 + k.
Putting the values in Equations (1) and (2), you have,
5.08h – 5.08k = –(–3) = 3
5.08h + 5.08k = +3.0968
then, h = 0.2363 and k = 0.0095
Thus, x1 = 2.54 + 0.2363 = 2.7763 and y1 = 2.5495
Let the next approximation be (x2, y2) such that x2 = x1 + h and y2 = y1 + k.
So, f1(x1, y1) = –1.7921 and f2(x1, y1) = –1.7922
∂f1/∂x1 = 5.5526; ∂f1/∂y1 = –5.099; ∂f2/∂x1 = 5.5526 and ∂f2/∂y1 = 5.099
Then, the Equations (1) and (2) become,
5.5526h – 5.099k = +1.7921
5.5526h + 5.099k = 1.7922
On solving these two, you get, h = 0.3228 and k = 0.0000098 ~ 0.00001.
Then, x2 = x1 + h and y2 = y1 + k
x2 = 2.7763 + 0.3228 = 3.0991
and, y2 = 2.5495 + 0.00001 = 2.54951 ~ 2.5495.
For the third iteration (x3, y3), you have,
f1(x2, y2) = 0.1045 and f2(x2, y2) = 0.1044
∂f1/∂x2 = 6.1982; ∂f1/∂y2 = –5.099; ∂f2/∂x2 = 6.1982; ∂f2/∂y2 = 5.099
and the Equations (1) and (2) become,
6.1982h – 5.099k = –0.1045
6.1982h + 5.099k = –0.1044
giving h = –0.0169 and k = 0.0000098
Hence, x3 = 3.0991 – 0.0169 = 3.0822 and y3 = 2.5495098.
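The whole scheme of Equations (4.20)–(4.21) fits in a short loop; a sketch for the system of Example 4.10 follows, solving the 2×2 linear system for h and k by Cramer's rule at each step. The iteration count is an illustrative choice; the exact solution is x = √9.5, y = √6.5.

```python
# Newton's method for f1 = x^2 - y^2 - 3 = 0, f2 = x^2 + y^2 - 16 = 0,
# started from (2.54, 2.54) as in Example 4.10.
x, y = 2.54, 2.54
for _ in range(10):
    f1 = x * x - y * y - 3.0
    f2 = x * x + y * y - 16.0
    # Jacobian entries: df1/dx = 2x, df1/dy = -2y, df2/dx = 2x, df2/dy = 2y
    a11, a12, a21, a22 = 2 * x, -2 * y, 2 * x, 2 * y
    det = a11 * a22 - a12 * a21
    h = (-f1 * a22 + f2 * a12) / det   # Cramer's rule for the 2x2 system
    k = (-f2 * a11 + f1 * a21) / det
    x, y = x + h, y + k
print(round(x, 4), round(y, 4))  # 3.0822 2.5495
```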
4.9.4 Generalized Newton’s Method for Multiple Roots
Let f(x) = 0 be an equation. If α is a root of f(x) = 0 with multiplicity m, i.e., the root α is repeated m times, then,
xn+1 = xn – m·f(xn)/f ′(xn)
This is called the generalized Newton’s method. When m = 1, it reduces to
xn+1 = xn − f(xn)/f ′(xn)
which is the normal Newton–Raphson method.
Example 4.11: Find the double root of the equation x3 – 7x2 + 16x – 12 = 0
correct upto three decimal places using Newton’s method.
Solution: Let α be the root of the equation f(x) = 0 with multiplicity m. Then it is also a root of f ′(x) = 0 with multiplicity (m – 1), of f ″(x) = 0 with multiplicity (m – 2), and so on. Hence, if the initial approximation x0 is sufficiently close to α (i.e., the actual root), then the expressions
x0 − m·f(x0)/f ′(x0), x0 − (m − 1)·f ′(x0)/f ″(x0), x0 − (m − 2)·f ″(x0)/f ″′(x0), ...
will have the same value.
By this conclusion, let α be the double root of x³ – 7x² + 16x – 12 = 0; then α is also a root of f ′(x) = 0.
Let, x0 = 1
then, f(x0) = –2, f ′(x0) = 3x0² – 14x0 + 16 = 5
and f ″(x0) = 6x0 – 14 = –8
then the expressions,
x1 = x0 − m·f(x0)/f ′(x0) = 1 − 2 × (−2)/5 = 1.80
and x1 = x0 − (m − 1)·f ′(x0)/f ″(x0) = 1 − (1) × 5/(−8) = 1.625
The closeness of these two values indicates that there is a double root near x = 1.6.
Let, x0 = 1.6; then,
x1 = x0 − m·f(x0)/f ′(x0) = 1.6 − 2 × (−0.224)/1.28 = 1.95
and x1 = x0 − (m − 1)·f ′(x0)/f ″(x0) = 1.6 − (2 − 1) × 1.28/(−4.4) = 1.8909
The closeness of these values indicates that there is a double root near x = 1.9.
Let, x0 = 1.9; then,
x1 = x0 − m·f(x0)/f ′(x0) = 1.9 − 2 × (−0.011)/0.23 = 1.99565
and x1 = x0 − (m − 1)·f ′(x0)/f ″(x0) = 1.9 − 1 × 0.23/(−2.6) = 1.9885
The closeness shows that there is a double root at x = 1.99, which is quite near the actual root x = 2.
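Running the generalized formula with m = 2 directly gives the double root; a sketch (the starting guess and iteration count are illustrative choices):

```python
# Generalized Newton's method x_{n+1} = x_n - m f(x_n)/f'(x_n) with m = 2 for
# the double root of f(x) = x^3 - 7x^2 + 16x - 12 (Example 4.11).
f  = lambda x: x**3 - 7 * x**2 + 16 * x - 12
df = lambda x: 3 * x**2 - 14 * x + 16
m, x = 2, 1.0
for _ in range(20):
    if df(x) == 0:          # guard: f' also vanishes exactly at the double root
        break
    x = x - m * f(x) / df(x)
print(round(x, 3))  # 2.0
```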
CHECK YOUR PROGRESS
4. What is Secant Method?
5. What is the condition for Convergence of Iteration Method?
6. What is Newton-Raphson Method?
4.10 SUMMARY
In this unit, you have learned that:
• Analytical methods can solve algebraic equations of degree two, three or four.
• If f(x) = a0 + a1x + a2x² + ... + anxⁿ, where the a’s are constants, an ≠ 0, and n is a positive integer, then f(x) is a polynomial of degree n.
• If a + √b is an irrational root of an equation f(x) = 0, then a – √b must also
be a root of the equation f(x) = 0.
• There are two kinds of methods to find the solution of Algebraic and
Transcendental equations. These are:
o Direct Method
o Iterative Method
• Direct method gives the roots of the equation f(x) = 0 in finite number of
steps.
• Iterative method is based on the idea of successive approximations.
• The most important step in iterative method is to choose the right initial
guess.
• The fastness of convergence to the root is represented by its rate of
convergence.
• Bisection method is based on the repeated application of the Intermediate
Value Theorem.
• The bisection method is a root-finding algorithm which repeatedly bisects
an interval then selects a subinterval in which a root must lie for further
processing.
• It is a very simple and robust method, but it is also relatively slow.
• Regula–Falsi method is a root-finding algorithm that combines features
from the bisection method and the secant method.
• Regula–Falsi or method of false position is the oldest method of finding
roots of the equation f(x) = 0.
• Secant method is a root-finding algorithm that uses a succession of roots of
secant lines to better approximate a root of a function f.
• The graph of y = f(x) is approximated by a secant line but at every iteration
the two most recent approximations are used to find out the next
approximation.
• Iterative method attempts to solve a problem (for example, finding the root
of an equation or system of equations) by finding successive approximations
to the solution starting from an initial guess.
4.11 KEY TERMS
• Bisection method: It is a root-finding algorithm which repeatedly bisects
an interval then selects a subinterval in which a root must lie for further
processing.
• Regula–Falsi method: It is a root-finding algorithm that combines features of the bisection and secant methods.
• Secant method: Secant method is a root-finding algorithm that uses a
succession of roots of secant lines to better approximate a root of a function f.
• Iteration method: It attempts to solve a problem by finding successive
approximations to the solution starting from an initial guess.
4.12 ANSWERS TO ‘CHECK YOUR PROGRESS’
1. The two methods for finding solutions of Algebraic and Transcendental
equations are:
(i) Direct Method
(ii) Iterative Method
2. The main step in iterative methods is to choose the right starting
approximation. This starting approximation is termed the initial guess.
3. The fastness of convergence to the root in any method is represented by its
rate of convergence.
4. In the Secant method the graph of y = f(x) is approximated by a secant line,
and at every iteration the two most recent approximations are used to find
the next approximation.
5. It is not true that every sequence x0, x1, x2, ... will converge to the root;
the iteration method converges only when | φ′(x) | < 1 near the root.
6. Let y = f(x) or f(x) = 0 be the equation and let x = x0 be the initial
approximation to the root.
Let x = x1 be the exact root, then f(x1) = 0 and x1 = x0 + h.
Where, h is a small positive number known as step size.
⇒ f(x0 + h) = 0 (as f(x1) = 0)
Expanding by Taylor’s theorem, you have,
f(x0) + h·f ′(x0) + (h²/2!)·f ″(x0) + (h³/3!)·f ″′(x0) + ... = 0
Since h is a very small number, you can neglect the terms involving h², h³, ... .
So, after neglecting the higher order terms, you have,
f(x0) + h·f ′(x0) ~ 0
or h = – f(x0)/f ′(x0)
Thus, x1 = x0 + h = x0 – f(x0)/f ′(x0)
so, x1 = x0 − f(x0)/f ′(x0) gives a better approximation to the root of f(x) = 0 than x0.
Similarly, if x2 denotes a better approximation than x1, you have,
x2 = x1 − f(x1)/f ′(x1)
Continuing in this manner, you have,
xn+1 = xn − f(xn)/f ′(xn)
This is known as Newton–Raphson method.
4.13 QUESTIONS AND EXERCISES
Short-Answer Questions
1. Write intermediate value theorem in brief.
2. What is the highest power of x in a quintic equation expressed in terms of x?
3. What are methods to find solution to algebraic and transcendental equation?
4. What is bisection method?
5. What is the equation used for computing by Newton–Raphson method?
Long-Answer Questions
1. Find a root of the following equations, using the Bisection method correct
upto three decimal places:
(i) x log10x = 1.2 (ii) 2x3 + x2 – 20x + 12 = 0
(iii) x3 + x – 1 = 0 (iv) cos x = 3x – 1
2. Using Regula–Falsi method find the real root of the following equations
correct to four decimal places:
(i) 2x – logx = 6 (ii) x6 – x4 – x3 – 1 = 0
(iii) x3 – 4x – 9 = 0 (iv) xex = 2
3. Evaluate √15 by Iteration method correct upto four decimal places.
4. Find the roots of the following equations correct to three decimal places
using Secant method:
(i) x – e–x = 0 (ii) x = 1/2 + sin x
5. Apply Iteration method to find the negative root of the equation
x3 – 4x + 7 = 0 correct upto four decimal places.
6. Find the root of the equation 2x = cos x + 3 correct upto three decimal
places using (i) Iteration method and (ii) Regula–Falsi method.
7. Use the following iterative formula for 1/√a to find 1/√5 correct upto four decimal
places: xn+1 = xn(3 − a·xn²)/2
Show that the convergence is of order two.
8. Apply Newton–Raphson method to find roots of the following equations: NOTES
(i) x = 1 + sin x (ii) x4 + x2 – 80 = 0
(iii) x3 + x2 – 100 = 0 (iv) ex cos x = 2.5
9. Solve the given non-linear equations using Newton–Raphson method.
(i) x2 – y2 + x = 5 (ii) x2 + y2 = 4
(iii) 2xy + y = 5 (iv) ex + y = 1
10. Find the cube root of 41 correct upto four decimal places.
11. Solve x4 – 5x3 + 20x2 – 40x + 60 = 0, by Newton–Raphson method to
find a pair of complex roots.
12. Use Newton–Raphson method to find the smallest root of the equation
ex sin x = 1.
13. Find the negative root of the equation x3 – 21x + 3500 = 0 correct upto
three decimal places using Newton–Raphson method.
14. Using Newton–Raphson method find a root of the following equations
correct upto three decimal places:
(i) ex = x3 + cos 25x which is near 4.5
(ii) x sin x + cos x = 0 which is near π
(iii) x4 – x – 10 = 0
15. Find a double root of the equation x3 – x2 – x + 1 = 0 using Newton–Raphson
method, correct up to three decimal places only.
4.14 FURTHER READING
Mott, J.L. 2007. Discrete Mathematics for Computer Scientists, 2nd Edition.
New Delhi: Prentice-Hall of India Pvt. Ltd.
Bonini, Charles P., Warren H. Hausman, and Harold Bierman. 1986. Quantitative
Analysis for Business Decisions. Illinois: Richard D. Irwin.
Charnes, A., W.W. Cooper, and A. Henderson. 1953. An Introduction to Linear
Programming. New York: John Wiley & Sons.
Gupta, S.C., and V.K. Kapoor. 1997. Fundamentals of Mathematical Statistics.
9th Revised Edition, New Delhi: Sultan Chand & Sons.
UNIT 5 NUMERICAL SOLUTION TO ORDINARY DIFFERENTIAL EQUATIONS
Structure
5.0 Introduction
5.1 Unit Objectives
5.2 Picard’s Method of Successive Approximations
5.3 Taylor’s Series Method
5.4 Euler’s Method
5.5 Runge–Kutta Method
5.5.1 Runge–Kutta Second Order Method
5.5.2 Runge–Kutta Fourth Order Method
5.6 Predictor–Corrector Methods
5.6.1 Modified Euler’s Method
5.6.2 Milne’s Predictor–Corrector Method
5.7 Summary
5.8 Key Terms
5.9 Answers to ‘Check Your Progress’
5.10 Questions and Exercises
5.11 Further Reading
5.0 INTRODUCTION
In this unit, you will learn that an ordinary differential equation is an essential tool
for modelling many physical problems in science and engineering. Also many
scientific laws are more readily expressed in terms of rate of change.
There are analytical methods to solve differential equations, but these methods
are applicable only to a limited class of differential equations. Numerical methods
are therefore useful in solving the remaining kinds of differential equations.
The first order differential equation is as follows:
dy/dx = f(x, y), given y(x0) = y0
The solution of a differential equation is the function that satisfies the differential
equation and also certain initial conditions on the function. Here, some numerical
methods to find the solution of differential equations will be discussed.
5.1 UNIT OBJECTIVES
After going through this unit, you will be able to:
• Understand Picard’s method of successive approximations
• Describe Taylor’s series method
• Discuss Euler’s method
• Explain Runge–Kutta method
• Know the predictor-corrector method
• Understand Adams–Bashforth method
• Know the system of differential equations
5.2 PICARD’S METHOD OF SUCCESSIVE APPROXIMATIONS
Let, dy/dx = f(x, y); y(x0) = y0 ...(5.1)
It is an ordinary differential equation of first order.
Integrating Equation (5.1) from x0 to x with respect to x, we get,
∫[y0 to y] dy = ∫[x0 to x] f(x, y) dx
or, y – y0 = ∫[x0 to x] f(x, y) dx
or, y = y0 + ∫[x0 to x] f(x, y) dx ...(5.2)
Let y1(x) be the first approximation of the solution of Equation (5.1); then the
second approximation y2(x) can be obtained as,
y2(x) = y0 + ∫[x0 to x] f(x, y1(x)) dx
Similarly, yn+1(x) = y0 + ∫[x0 to x] f(x, yn(x)) dx
Example 5.1: Find the successive approximate solutions of the initial value problem
dy/dx = xy + 1, with y(0) = 1, by Picard’s method.
Solution. We have, dy/dx = xy + 1, or y′ = xy + 1, with y(0) = 1
The first approximation by Picard’s method is,
y(1)(x) = y(0) + ∫[0 to x] (x·y(0) + 1) dx
y(1)(x) = 1 + ∫[0 to x] (x + 1) dx = 1 + x + x²/2
The second approximation by Picard’s method is,
y(2)(x) = y(0) + ∫[0 to x] (x·y(1)(x) + 1) dx
= 1 + ∫[0 to x] [x(1 + x + x²/2) + 1] dx
= 1 + ∫[0 to x] (x + x² + x³/2 + 1) dx = 1 + x + x²/2 + x³/3 + x⁴/8
Similarly, the third approximation by Picard’s method is,
y(3)(x) = y(0) + ∫[0 to x] [x(1 + x + x²/2 + x³/3 + x⁴/8) + 1] dx
= 1 + x + x²/2 + x³/3 + x⁴/8 + x⁵/15 + x⁶/48
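The Picard updates above can also be carried out numerically on a grid, replacing each exact integral by a cumulative trapezoidal sum; a sketch for the same problem y′ = xy + 1, y(0) = 1 (the grid size, interval and sweep count are illustrative choices):

```python
# Numerical Picard iteration for y' = x*y + 1, y(0) = 1 on [0, 0.5]:
# y_{n+1}(x) = 1 + integral from 0 to x of (t*y_n(t) + 1) dt,
# with the integral computed by the trapezoidal rule on a fixed grid.
N, X = 500, 0.5
h = X / N
xs = [i * h for i in range(N + 1)]
ys = [1.0] * (N + 1)                       # y_0(x) = y(0) = 1 everywhere
for _ in range(20):                        # 20 Picard sweeps
    g = [t * y + 1.0 for t, y in zip(xs, ys)]
    new = [1.0]
    for i in range(N):                     # cumulative trapezoid
        new.append(new[-1] + 0.5 * h * (g[i] + g[i + 1]))
    ys = new
y_at_half = ys[-1]
print(round(y_at_half, 4))  # about 1.677, matching the series at x = 0.5
```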
Example 5.2: Find four successive approximate solutions for the following initial
value problem: y′ = x + y, with y(0) = 1, using Picard’s method. Hence compute
y(0.1), y(0.2) and y(0.3), correct upto four significant digits.
Solution. We have, y′ = x + y with y(0) = 1
The first approximation is given by,
y(1)(x) = y(0) + ∫[0 to x] [x + y(0)] dx = 1 + ∫[0 to x] (1 + x) dx = 1 + x + x²/2
Second, third and fourth approximations are,
y(2)(x) = 1 + ∫[0 to x] (x + 1 + x + x²/2) dx = 1 + x + x² + x³/6
y(3)(x) = 1 + ∫[0 to x] (x + 1 + x + x² + x³/6) dx = 1 + x + x² + x³/3 + x⁴/24
and y(4)(x) = 1 + ∫[0 to x] (1 + 2x + x² + x³/3 + x⁴/24) dx
= 1 + x + x² + x³/3 + x⁴/12 + x⁵/120
Hence, the values of y(0.1), y(0.2) and y(0.3) are,
y(0.1) = 1 + (0.1) + (0.1)² + (0.1)³/3 + (0.1)⁴/12 + (0.1)⁵/120 ~ 1.110
(Correct upto four significant digits)
y(0.2) = 1 + (0.2) + (0.2)² + (0.2)³/3 + (0.2)⁴/12 + (0.2)⁵/120 ~ 1.243
y(0.3) = 1 + (0.3) + (0.3)² + (0.3)³/3 + (0.3)⁴/12 + (0.3)⁵/120 ~ 1.400
Example 5.3: Find y(0.5) and y(1.0) correct to four decimal places by solving
the following initial value problem using Picard’s method.
y′ = x2/(1 + y2) with y(0) = 0.
Solution. Given that y′ = x²/(1 + y²) with y(0) = 0. Using Picard’s successive
approximation method the first approximation is obtained as,
y(1)(x) = y(0) + ∫[0 to x] x²/(1 + [y(0)]²) dx
= 0 + ∫[0 to x] x²/(1 + 0) dx = x³/3
The second approximation is,
y(2)(x) = 0 + ∫[0 to x] x²/(1 + [y(1)(x)]²) dx
= ∫[0 to x] x²/(1 + x⁶/9) dx = tan⁻¹(x³/3)
The third approximation is,
y(3)(x) = 0 + ∫[0 to x] x²/(1 + [y(2)(x)]²) dx
= ∫[0 to x] x²/(1 + [tan⁻¹(x³/3)]²) dx, which is not integrable in closed form.
So, in computing y(0.5) and y(1.0) we will make use of y(2)(x) only.
Hence, y(2)(0.5) = tan⁻¹[(0.5)³/3] ~ 0.0416
and y(2)(1.0) = tan⁻¹(1/3) ~ 0.3218
Example 5.4: Find three successive approximate solutions of the following initial
value problem, by Picard’s method. Hence, compute y(0.2) and y(0.4) correct
upto four decimal places.
y′ = x² – y, y(0) = 1
Solution. The first three approximations are given as,
y(1)(x) = 1 + ∫[0 to x] (x² − 1) dx = 1 + x³/3 − x, or 1 − x + x³/3
y(2)(x) = 1 + ∫[0 to x] [x² − (1 − x + x³/3)] dx = 1 − x + x²/2 + x³/3 − x⁴/12
and y(3)(x) = 1 + ∫[0 to x] [x² − (1 − x + x²/2 + x³/3 − x⁴/12)] dx
= 1 + ∫[0 to x] (−1 + x + x²/2 − x³/3 + x⁴/12) dx
= 1 − x + x²/2 + x³/6 − x⁴/12 + x⁵/60
Hence,
y(0.2) = 1 – (0.2) + (0.2)²/2 + (0.2)³/6 − (0.2)⁴/12 + (0.2)⁵/60 ~ 0.8212
and y(0.4) = 1 – (0.4) + (0.4)²/2 + (0.4)³/6 − (0.4)⁴/12 + (0.4)⁵/60 ~ 0.6887
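A quick check of Example 5.4 is possible because the printed integrands integrate x² − y, whose initial value problem y′ = x² − y, y(0) = 1 has the closed form y = x² − 2x + 2 − e⁻ˣ (this reading of the equation is an assumption based on the worked integrals). A sketch:

```python
import math

# Third Picard approximation of Example 5.4 against the closed-form solution
# of y' = x^2 - y, y(0) = 1, namely y = x^2 - 2x + 2 - e^{-x}.
def y3(x):
    return 1 - x + x**2 / 2 + x**3 / 6 - x**4 / 12 + x**5 / 60

for x in (0.2, 0.4):
    exact = x**2 - 2 * x + 2 - math.exp(-x)
    print(x, round(y3(x), 4), round(exact, 4))
```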
5.3 TAYLOR’S SERIES METHOD
Consider, dy/dx = f(x, y); y(x0) = y0
Expanding the solution y(x) in a Taylor series around x = x0, we get:
y(x) = y(x0) + (x − x0)·y′(x0) + [(x − x0)²/2!]·y″(x0) + ...
where y′(x0), y″(x0) are the first order and second order derivatives, and so on.
Then, at x = x0 + h, the expansion becomes
y(x0 + h) = y(x0) + h·y′(x0) + (h²/2!)·y″(x0) + ...
which gives the value of y at x0 + h, where h is the step size.
Note: Taylor series method can be applied when the successive derivatives of the
function are not too complicated to calculate.
Example 5.5: Form the Taylor series solution of the initial value problem y′ =
x² – y², y(0) = 0, up to seven terms and hence compute y(0.5) and y(1.0), correct
upto four decimal places.
Solution. Given that y′ = x2 – y2 with y(0) = 0.
Differentiating successively we get,
clearly, y′(0) = 0
Now, y″ = 2x – 2y.y′
= 2x – 2y . (x2 – y2)
Hence, y″(0) = 2.(0) – 2(0).(02 – 02) = 0
Similarly,
y′″ = 2 – 2[y.y″ + (y′)2]
170 Self-Instructional Material
so y″′(0) = 2
y(iv) = –2(y·y″′ + 3y′·y″) and y(iv)(0) = 0
y(v) = −2(y·y(iv) + 4y′·y″′ + 3(y″)²) and y(v)(0) = 0
y(vi) = −2(y·y(v) + 5y′·y(iv) + 10y″·y″′) and y(vi)(0) = 0
y(vii) = −2(y·y(vi) + 6y′·y(v) + 15y″·y(iv) + 10(y″′)²)
Hence, y(vii)(0) = −2 × 10 × (y″′(0))² = –80.
Thus, the Taylor series upto seven terms contains only two non-zero terms and is
given as:
y(x) = (x³/3!)·y″′(0) + (x⁷/7!)·y(vii)(0)
so, y(x) = (x³/6)·(2) + (x⁷/5040)·(−80)
or, y(x) = x³/3 − x⁷/63
Hence, y(0.5) = (0.5)³/3 − (0.5)⁷/63 ~ 0.0415
and y(1.0) = 1/3 − 1/63 ~ 0.3175
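Keeping the factorials in the Taylor terms gives the two-term series y(x) ≈ x³/3 − x⁷/63, which can be checked against a fine-step fourth-order integration of the same equation; a sketch (the step size is an illustrative choice):

```python
# Two-term Taylor polynomial y ~ x^3/3 - x^7/63 for y' = x^2 - y^2, y(0) = 0,
# checked at x = 0.5 against small-step classical Runge-Kutta integration.
def f(x, y):
    return x * x - y * y

x, y, h = 0.0, 0.0, 0.001
while x < 0.5 - 1e-12:
    k1 = f(x, y)
    k2 = f(x + h / 2, y + h * k1 / 2)
    k3 = f(x + h / 2, y + h * k2 / 2)
    k4 = f(x + h, y + h * k3)
    y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    x += h
taylor = 0.5**3 / 3 - 0.5**7 / 63
print(round(y, 4), round(taylor, 4))  # 0.0415 0.0415
```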
Example 5.6: Given y′ = x + y, y(0) = 1. Evaluate y(0.1) and y(0.2) correct upto
four decimal places using Taylor series method.
Solution. Given that y′ = x + y with y(0) = 1.
The successive derivatives are obtained as
y″ = 1 + y′ = 1 + x + y, so y″(0) = 2 (as y′(0) = 1)
y″′ = y″ = 1 + y′, so y″′(0) = 2
y(iv) = y″′, so y(iv)(0) = 2
similarly, y(v) = y(iv), y(vi) = y(v), and so on;
also, y(v)(0) = y(vi)(0) = y(vii)(0) = ... = 2
Thus, the Taylor series upto five terms is obtained as
y(x) = y(0) + x·y′(0) + (x²/2!)·y″(0) + (x³/3!)·y″′(0) + ... + (x⁵/5!)·y(v)(0) + ...
~ 1 + x + (x²/2!)·(2) + (x³/3!)·(2) + (x⁴/4!)·(2) + (x⁵/5!)·(2)
y(0.1) ~ 1 + (0.1) + [(0.1)²/2!]·(2) + [(0.1)³/3!]·(2) + [(0.1)⁴/4!]·(2) + [(0.1)⁵/5!]·(2)
~ 1.1103
y(0.2) ~ 1 + (0.2) + [(0.2)²/2!]·(2) + [(0.2)³/3!]·(2) + [(0.2)⁴/4!]·(2) + [(0.2)⁵/5!]·(2)
~ 1.2428
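Evaluating the partial sum directly confirms the two values; the same problem has exact solution y = 2eˣ − x − 1, so both can be compared. A sketch (the number of kept terms is a choice):

```python
import math

# Partial sum of the Taylor series for y' = x + y, y(0) = 1: all derivatives
# from the second onward equal 2 at x = 0, so y ~ 1 + x + sum 2 x^n / n!.
def taylor(x, terms=6):
    s = 1.0 + x
    fact = 1
    for n in range(2, terms + 1):
        fact *= n                 # fact = n!
        s += 2 * x**n / fact
    return s

for x in (0.1, 0.2):
    print(x, round(taylor(x), 4), round(2 * math.exp(x) - x - 1, 4))
```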
CHECK YOUR PROGRESS
1. What is the first order differential equation?
2. Find the first approximation of the initial value problem dy/dx = xy + 1,
i.e., y′ = xy + 1 with y(0) = 1, by Picard’s method.
3. What will be the result of expanding the function dy/dx = f(x, y); y(x0) = y0
using Taylor series around x = x0?
5.4 EULER’S METHOD
Euler’s method can be described as a technique of developing a piecewise linear
approximation to the solution. It is the oldest and the simplest method. In the initial
value problem the starting point of the solution curve and the slope of the curve at
the starting point are given. The method uses this information and extrapolates the
solution curve using the specified step size.
[Figure: the solution curve through (x0, y0) is approximated by successive tangent-line segments at x0, x0 + h, x0 + 2h, ..., x0 + nh; the gap between the true value T and the approximate value SN at XN is the accumulated error.]

Figure 5.1 Graphical Representation of Euler’s Method
Consider the differential equation,
dy/dx = f(x, y); y(x0) = y0
Let us divide X0XN into n subintervals each of step size h, as X0, X1, ..., XN, so
that h is very small.
In the interval X0X1 we approximate the curve by its tangent at S.
Then, y1 = X1S1 = X1R1 + R1S1 = y0 + SR1·tan θ
or y1 = y0 + SR1·(dy/dx at S) = y0 + h·f(x0, y0)
since (i) dy/dx at S = (x0, y0) is f(x0, y0), and
(ii) SR1 is the step size between X0 and X1, i.e., h.
So, y1 = y0 + h·f(x0, y0)
Repeating this process n times, we finally reach an approximation XNSN of XNT
given by,
yn = yn−1 + h·f(x0 + (n − 1)h, yn−1)
Example 5.7: Compute values of y at x = 0.02, by Euler’s method taking
h = 0.01, given y is the solution of the following initial value problem:
dy/dx = x³ + y with y(0) = 1
Solution. Euler’s method for solving the initial value problem dy/dx = f(x, y),
y(x0) = y0 is
yn+1 = yn + h·f(xn, yn), n = 0, 1, 2, ...
Taking h = 0.01, we have x1 = 0.01 and x2 = 0.02.
Using Euler’s method, we have
y(0.01) = y0 + h·f(x0, y0) = 1 + 0.01(0³ + 1) = 1.0100 (Since y(0) = 1)
and y(0.02) = y1 + h·f(x1, y1) = 1.0100 + 0.01 × [(0.01)³ + 1.0100] = 1.0201
Thus, y(0.02) = 1.0201
Example 5.8: Using Euler’s method, solve for y at x = 0.1 from y′ = x + y + xy,
y(0) = 1, taking step size h = 0.025.
Solution. Given that y′ = x + y + xy ≡ f(x, y), y(0) = 1, h = 0.025, x = 0.1
n = (x − x0)/h = (0.1 − 0)/0.025 = 4
Hence, we divide the interval (0, 0.1) into 4 steps with step size h = 0.025.
The various calculations are arranged as follows:

xn       yn        y′n = xn + yn + xn·yn    yn+1 = yn + h·y′n
0.000    1.0000    1.0000                   y1 = y0 + h·y′0 = 1 + 0.025 × 1 = 1.0250
0.025    1.0250    1.0756                   y2 = y1 + h·y′1 = 1.0519
0.050    1.0519    1.1545                   y3 = y2 + h·y′2 = 1.0808
0.075    1.0808    1.2368                   y4 = y3 + h·y′3 = 1.1117

So, at x = 0.1 the value of y is y4 = 1.1117.
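The four Euler steps can be sketched directly; four steps of size h = 0.025 from x = 0 reach x = 0.1:

```python
# Euler's method y_{n+1} = y_n + h*f(x_n, y_n) for y' = x + y + x*y, y(0) = 1,
# with h = 0.025, as in Example 5.8.
def f(x, y):
    return x + y + x * y

x, y, h = 0.0, 1.0, 0.025
for _ in range(4):
    y = y + h * f(x, y)
    x = x + h
print(round(y, 4))  # 1.1117, the value of y at x = 0.1
```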
5.5 RUNGE–KUTTA METHOD
The Runge–Kutta methods, derived by C. Runge and extended by W. Kutta, are
a family of methods of which the second order and fourth order methods are
widely used. In these methods, first the slope at some of the intermediate points is
computed and then a weighted average of the slopes is used to extrapolate to the
new solution point.
5.5.1 Runge–Kutta Second Order Method
Consider dy/dx = f(x, y); y(x0) = y0 as the differential equation.
Runge–Kutta second order method matches the Taylor series method up
to the second-degree terms in h, where h is the step size.
At (x0, y0), the initial point, calculate the slope of the curve as f(x0, y0). Let
it be k1. Now compute the slope at the point (x1, y0 + k1h) as f(x1, y0 + k1h), where
x1 = x0 + h. Let this new slope be k2. Find the average of these slopes k1 and k2
and then compute the value of the dependent variable y from the following equation:
y1 = y0 + k·h, where k = (k1 + k2)/2
Thus, the Runge–Kutta second order method becomes,
y1 = y0 + (h/2)·(k1 + k2) = y0 + h·k
where, k1 = f(x0, y0)
k2 = f(x0 + h, y0 + k1·h)
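One step of the scheme above can be sketched for a simple test problem; the equation y′ = y and step size 0.1 are illustrative choices, with exact solution e^0.1 = 1.10517...:

```python
# One step of the second order Runge-Kutta method: average the slope at the
# start of the step with the slope at the Euler-predicted end point.
def rk2_step(f, x0, y0, h):
    k1 = f(x0, y0)
    k2 = f(x0 + h, y0 + k1 * h)
    return y0 + h * (k1 + k2) / 2      # y1 = y0 + h*(k1 + k2)/2

y1 = rk2_step(lambda x, y: y, 0.0, 1.0, 0.1)
print(round(y1, 4))  # 1.105, close to e^0.1 = 1.10517...
```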
5.5.2 Runge–Kutta Fourth Order Method
Let dy/dx = f(x, y); y(x0) = y0 be a differential equation. Then, calculate
k1 = f(x0, y0)
k2 = f(x0 + h/2, y0 + k1·h/2)
k3 = f(x0 + h/2, y0 + k2·h/2)
and k4 = f(x0 + h, y0 + k3·h)
Finally, k = (1/6)·(k1 + 2k2 + 2k3 + k4)
which gives the required approximate value as,
y1 = y0 + k·h
Example 5.9: Using Runge–Kutta method of order 4, compute y(0.2) and y(0.4)
from dy/dx = x² + y², y(0) = 1, taking h = 0.1.
Solution. Given that dy/dx = x² + y² ≡ f(x, y)
Taking h = 0.1, we calculate y(0.2) in two steps as follows:
Step 1: x0 = 0, y0 = 1, h = 0.1
so, k1 = h·f(x0, y0) = 0.1(0² + 1²) = 0.1
k2 = h·f(x0 + h/2, y0 + k1/2) = 0.1[(0.05)² + (1 + 0.05)²] = 0.1105
k3 = h·f(x0 + h/2, y0 + k2/2) = 0.1[(0.05)² + (1 + 0.05525)²] = 0.1116
k4 = h·f(x0 + h, y0 + k3) = 0.1[(0.1)² + (1 + 0.1116)²] = 0.1246
k = (1/6)·(k1 + 2k2 + 2k3 + k4) = 0.1115
giving y(0.1) = y0 + k = 1 + 0.1115 = 1.1115
NOTES Step 2: x1 = x0 + h = 0.1, y1 = 1.1115, h = 0.1
∴ k 1 = hf(x1, y1) = 0.1[(0.1) + (1.1115)2] = 0.1245
2

 h k1 
k 2 = hf  x1 + , y1 + 
 2 2

 2
0.1   0.1245  
2
= 0.1  0.1 +  + 1.1115 +   = 0.1400
 2   2  

 h k2 
k 3 = hf  x1 + , y1 + 
 2 2

 2
0.1   0.1400  
2
= 0.1   0.1 +  + 1.1115 +   = 0.1418
 2   2  

k 4 = hf ( x1 + h, y1 + k3 )
2 2
= 0.1[(0.1 + 0.1) + (1.1115 + 0.1418) ] = 0.1611

1
k = ( k1 + 2k 2 + 2k3 + k4 ) = 0.1415
6
giving y(0, 2) = y1 + k = 1.1115 + 0.1415 = 1.2530.
Now calculate y(0.4) by taking x0 = 0.2, y0 = 1.2530 and h = 0.1,
in two steps as follows:
Step 1: x0 = 0.2, y0 = 1.2530, h = 0.1
k1 = h·f(x0, y0) = 0.1[(0.2)² + (1.2530)²] = 0.1610
k2 = h·f(x0 + h/2, y0 + k1/2) = 0.1[(0.25)² + (1.2530 + 0.0805)²] = 0.1841
k3 = h·f(x0 + h/2, y0 + k2/2) = 0.1[(0.25)² + (1.2530 + 0.09205)²] = 0.1872
k4 = h·f(x0 + h, y0 + k3) = 0.1[(0.3)² + (1.2530 + 0.1872)²] = 0.2164
Now, k = (1/6)·(k1 + 2k2 + 2k3 + k4) = 0.1867
giving y(0.3) = y0 + k = 1.2530 + 0.1867 = 1.4397
Step 2: x1 = 0.3, y1 = 1.4397 and h = 0.1
so, k1 = h·f(x1, y1) = 0.1[(0.3)² + (1.4397)²] = 0.2163
k2 = h·f(x1 + h/2, y1 + k1/2) = 0.1[(0.35)² + (1.5478)²] = 0.2518
k3 = h·f(x1 + h/2, y1 + k2/2) = 0.1[(0.35)² + (1.5656)²] = 0.2574
k4 = h·f(x1 + h, y1 + k3) = 0.1[(0.4)² + (1.6971)²] = 0.3040
∴ k = (1/6)·(k1 + 2k2 + 2k3 + k4) = 0.2565
giving y(0.4) = y1 + k = 1.4397 + 0.2565 = 1.6962.
Thus,
x = 0.1, y = 1.1115
x = 0.2, y = 1.2530
x = 0.3, y = 1.4397
and x = 0.4, y = 1.6962.
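The four steps of Example 5.9 can be run in one loop; a sketch (the small differences in the last decimal place come from the per-step rounding in the hand computation):

```python
# Four classical Runge-Kutta steps for y' = x^2 + y^2, y(0) = 1, h = 0.1,
# as in Example 5.9.
def f(x, y):
    return x * x + y * y

x, y, h = 0.0, 1.0, 0.1
values = []
for _ in range(4):
    k1 = h * f(x, y)
    k2 = h * f(x + h / 2, y + k1 / 2)
    k3 = h * f(x + h / 2, y + k2 / 2)
    k4 = h * f(x + h, y + k3)
    y += (k1 + 2 * k2 + 2 * k3 + k4) / 6
    x += h
    values.append(round(y, 4))
print(values)  # close to the tabulated 1.1115, 1.2530, 1.4397, 1.6962
```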
Example 5.10: Find by Runge–Kutta method of order 4 an approximate value of
y at x = 0.8, given that y = 0.41 when x = 0.4 and dy/dx = √(x + y).
Solution. Given x0 = 0.4, y0 = 0.41, h = 0.4 and f(x, y) = √(x + y)
k1 = h·f(x0, y0) = 0.4·√(0.4 + 0.41) = 0.36
k2 = h·f(x0 + h/2, y0 + k1/2) = 0.4·√[(0.4 + 0.2) + (0.41 + 0.18)] = 0.4363
k3 = h·f(x0 + h/2, y0 + k2/2) = 0.4·√(0.6 + 0.6282) = 0.4433
k4 = h·f(x0 + h, y0 + k3) = 0.4·√(0.8 + 0.8533) = 0.5143
k = (1/6)·(k1 + 2k2 + 2k3 + k4) = 0.4389
giving y(0.8) = y0 + k = 0.41 + 0.4389 = 0.8489.
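The single step of Example 5.10 can be sketched as:

```python
import math

# One fourth order Runge-Kutta step for y' = sqrt(x + y), y(0.4) = 0.41,
# h = 0.4, advancing from x = 0.4 to x = 0.8 (Example 5.10).
def f(x, y):
    return math.sqrt(x + y)

x0, y0, h = 0.4, 0.41, 0.4
k1 = h * f(x0, y0)
k2 = h * f(x0 + h / 2, y0 + k1 / 2)
k3 = h * f(x0 + h / 2, y0 + k2 / 2)
k4 = h * f(x0 + h, y0 + k3)
y1 = y0 + (k1 + 2 * k2 + 2 * k3 + k4) / 6
print(round(y1, 4))  # 0.8489
```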
Example 5.11: Given that dy/dx = (y² – 2x)/(y² + x) and y = 1 at x = 0; find y
for x = 0.1, 0.2, 0.3 and 0.4.
Solution. Runge–Kutta fourth order method is,
k1 = h·f(x0, y0)
k2 = h·f(x0 + h/2, y0 + k1/2)
k3 = h·f(x0 + h/2, y0 + k2/2)
and k4 = h·f(x0 + h, y0 + k3)
k = (1/6)·(k1 + 2k2 + 2k3 + k4)
Thus, x1 = x0 + h and y1 = y0 + k
Given that f(x, y) = (y² − 2x)/(y² + x), x0 = 0, y0 = 1 and h = 0.1
Step 1: k1 = h·f(x0, y0) = 0.1[(1² − 2 × 0)/(1² + 0)] = 0.1
k2 = h·f(x0 + h/2, y0 + k1/2) = 0.1[((1.05)² − 2 × 0.05)/((1.05)² + 0.05)] = 0.087
k3 = h·f(x0 + h/2, y0 + k2/2) = 0.1[((1.0435)² − 2 × 0.05)/((1.0435)² + 0.05)] = 0.0868
k4 = h·f(x0 + h, y0 + k3) = 0.1[((1.0868)² − 2 × 0.05)/((1.0868)² + 0.05)] = 0.0878
k = (1/6)·(k1 + 2k2 + 2k3 + k4) = 0.0892
y1 = y0 + k = 1 + 0.0892 = 1.0892
giving y(0.1) = 1.0892 at x = 0.1
Step 2: x1 = 0.1, y1 = 1.0892 and h = 0.1
k1 = h·f(x1, y1) = 0.1[((1.0892)² − 2 × 0.1)/((1.0892)² + 0.1)] = 0.0767
k2 = h·f(x1 + h/2, y1 + k1/2) = 0.1[((1.1275)² − 2 × 0.15)/((1.1275)² + 0.15)] = 0.0683
k3 = h·f(x1 + h/2, y1 + k2/2) = 0.1[((1.1234)² − 2 × 0.15)/((1.1234)² + 0.15)] = 0.0681
k4 = h·f(x1 + h, y1 + k3) = 0.1[((1.1233)² − 2 × 0.1)/((1.1233)² + 0.1)] = 0.0780
k = (1/6)·(k1 + 2k2 + 2k3 + k4) = 0.0713
so, y(0.2) = y1 + k = 1.0892 + 0.0713 = 1.1605
Step 3: x2 = 0.1, y2 = 1.1605 and h = 0.1

 (1.1605) 2 − 2 × 0.2 
k 1 = hf(x2, y2) = 0.1   = 0.0612
 (1.1605) 2 + 0.2 

 (1.1911) 2 − 2 × 0.25 
k 2 = hf  x2 + , y2 + 1  = 0.1 
h k
 = 0.0551
 2 2  (1.1911) 2 + 0.25 

 h k   (1.1880) 2 − 2 × 0.25 
k 3 = hf  x2 + , y2 + 2  = 0.1   = 0.0549
 2 2  2
 (1.1880) + 0.25 

Self-Instructional Material 179


Numerical Solution to
Ordinary Differential  (1.1879) 2 − 2 × 0.3 
Equations k 4 = hf ( x2 + h, y2 + k3 ) = 0.1   = 0.0474
 (1.1880) 2 + 0.3 

NOTES 1
k = ( k1 + 2k2 , 2k3 + k4 ) = 0.0548
6
so, y(0.3) = y2 + k = 1.1605 + 0.0548 = 1.2153
Step 4: x3 = 0.3, y3 = 1.2153 and h = 0.1

 (1.2153) 2 − 2 × 0.3 
k 1 = hf ( x3 , y3 ) = 0.1  2  = 0.0493
 (1.2153) + 0.3 

 h k   (1.24)2 − 2 × 0.35 
k 2 = hf  x3 + , y3 + 1  = 0.1   = 0.0444
 2 2 2
 (1.24) + 0.35 

 h k   (1.2375)2 − 2 × 0.35 
k 3 = hf  x3 + , y3 + 2  = 0.1   = 0.0442
 2 2  2
 (1.2375) + 0.35 

 (1.2595) 2 − 2 × 0.4 
k 4 = hf ( x3 + h, y3 + k ) = 0.1   = 0.0396
2
 (1.2595) + 0.4 

1
so, k = ( k1 + 2k 2 + 2k3 + k4 ) = 0.0444
6
giving y(0.4) = y3 + k = 1.2597.
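Multi-step hand arithmetic like this is error-prone, so it is worth recomputing all four steps in full precision. A short sketch (names are illustrative):

```python
def rk4(f, x, y, h, steps):
    # Advance y by repeated fourth-order Runge-Kutta steps,
    # returning the list of successive y values.
    ys = [y]
    for _ in range(steps):
        k1 = h * f(x, y)
        k2 = h * f(x + h / 2, y + k1 / 2)
        k3 = h * f(x + h / 2, y + k2 / 2)
        k4 = h * f(x + h, y + k3)
        y += (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += h
        ys.append(y)
    return ys

# dy/dx = (y^2 - 2x)/(y^2 + x), y(0) = 1, h = 0.1
f = lambda x, y: (y * y - 2 * x) / (y * y + x)
for xi, yi in zip((0.1, 0.2, 0.3, 0.4), rk4(f, 0.0, 1.0, 0.1, 4)[1:]):
    print(f"y({xi}) = {yi:.4f}")
```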
Example 5.12: Using Runge–Kutta fourth order method, find y for x = 0.1 and 0.2, given that y′ = xy + y², y(0) = 1, taking h = 0.1.
Solution. Given that f(x, y) = xy + y², x0 = 0, y0 = 1 and h = 0.1.
Step 1:
k1 = hf(x0, y0) = 0.1[0 × 1 + 1²] = 0.1
k2 = hf(x0 + h/2, y0 + k1/2) = 0.1[(0.05 × 1.05) + (1.05)²] = 0.1155
k3 = hf(x0 + h/2, y0 + k2/2) = 0.1[(0.05 × 1.0578) + (1.0578)²] = 0.1172
k4 = hf(x0 + h, y0 + k3) = 0.1[(0.1 × 1.1172) + (1.1172)²] = 0.1360
k = (1/6)(k1 + 2k2 + 2k3 + k4) = 0.1169
giving y(0.1) = y0 + k = 1 + 0.1169 = 1.1169
Step 2:
x1 = 0.1, y1 = 1.1169 and h = 0.1
k1 = hf(x1, y1) = 0.1[0.1 × 1.1169 + (1.1169)²] = 0.1359
k2 = hf(x1 + h/2, y1 + k1/2) = 0.1[(0.15 × 1.1849) + (1.1849)²] = 0.1582
k3 = hf(x1 + h/2, y1 + k2/2) = 0.1[(0.15 × 1.1960) + (1.1960)²] = 0.1610
k4 = hf(x1 + h, y1 + k3) = 0.1[(0.2 × 1.2779) + (1.2779)²] = 0.1889
k = (1/6)(k1 + 2k2 + 2k3 + k4) = 0.1605
giving y(0.2) = y1 + k = 1.1169 + 0.1605 = 1.2774
Hence, y(0.1) = 1.1169 at x = 0.1
and y(0.2) = 1.2774 at x = 0.2

CHECK YOUR PROGRESS


4. How would you describe Euler’s method?
5. Name the people who derived the Runge–Kutta method.
6. Which order of the Runge-Kutta family of methods is widely used?

5.6 PREDICTOR–CORRECTOR METHODS

The methods discussed so far are single-step methods. Now you will learn about some multi-step methods, which make use of the past information of the curve to extrapolate the solution curve. Such methods are known as Predictor–Corrector methods.
5.6.1 Modified Euler’s Method
As in Euler’s method, the curve of solution in the interval X0X1 is approximated by a tangent at S (See Figure 5.1) such that at S1, we have
y1 = y0 + hf(x0, y0) ...(5.3)
Now the slope of the curve of solution through S1 is,
(dy/dx) at S1 = f(x0 + h, y1)
and the tangent S1T1 at S1 is drawn meeting the ordinate through X2 in S2(x0 + 2h, y2). We can find a better approximation y1(1) of y(x0 + h) by taking the slope of the curve as the mean of the slopes of the tangents at S and S1, thus,
y1(1) = y0 + (h/2)[f(x0, y0) + f(x0 + h, y1)] ...(5.4)
Since the slope at S1 is not known, we compute the value of y1 from Euler’s method (i.e., Equation (5.3)) and substitute it into the R.H.S. of Equation (5.4) to obtain the first modified value y1(1).
Similarly, y1(2) = y0 + (h/2)[f(x0, y0) + f(x0 + h, y1(1))]
Repeat this step till two consecutive values of y agree.
Once y1 is obtained to a desired degree of accuracy, compute,
y2 = y1 + hf(x0 + h, y1)
and a better approximation of y2, i.e., y2(1), is obtained as,
y2(1) = y1 + (h/2)[f(x0 + h, y1) + f(x0 + 2h, y2)]
repeat this step till y2 is obtained with desired accuracy.
The modified Euler’s method can be rewritten as:
y_{n+1}^p = y_n + h f(x_n, y_n) ...(5.5)
and y_{n+1}^c = y_n + (h/2)[f(x_n, y_n) + f(x_{n+1}, y_{n+1}^p)] ...(5.6)
Equation (5.5) is known as Predictor formula and Equation (5.6) is known
as Corrector formula.
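Equations (5.5) and (5.6) translate directly into code. Here is a minimal sketch (names are my own) that applies the predictor once and then repeats the corrector until two consecutive values agree:

```python
def modified_euler(f, x0, y0, h, steps, tol=1e-6):
    # Predictor (5.5) followed by the corrector (5.6), iterated
    # until successive corrected values differ by less than tol.
    x, y = x0, y0
    for _ in range(steps):
        slope = f(x, y)
        y_next = y + h * slope                                 # predictor
        while True:
            y_corr = y + h / 2 * (slope + f(x + h, y_next))    # corrector
            if abs(y_corr - y_next) < tol:
                break
            y_next = y_corr
        x, y = x + h, y_corr
    return y

# dy/dx = 1 - y, y(0) = 0 with h = 0.1 (the worked example that follows)
print(round(modified_euler(lambda x, y: 1 - y, 0.0, 0.0, 0.1, 3), 4))
```

For this right-hand side the corrector has a simple fixed point at each step, so the iteration converges in a handful of passes.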
Example 5.13: Solve dy/dx = 1 – y, y(0) = 0 in the range 0 ≤ x ≤ 0.3 using modified Euler’s method, choosing step size 0.1. Also compare the obtained result with the exact solution.
Solution. Given that y′ = 1 – y, y(0) = 0.
Modified Euler’s method is
y_{n+1}^p = y_n + hf(x_n, y_n)
and y_{n+1}^c = y_n + (h/2)[f(x_n, y_n) + f(x_{n+1}, y_{n+1}^p)].
Taking h = 0.1, the calculations are as follows:
y1p = y0 + hf(x0, y0) = 0 + 0.1 × (1 – 0) = 0.1
y1c(1) = y0 + (h/2)[f(x0, y0) + f(x1, y1p)]
= 0 + (0.1/2)[(1 − 0) + (1 − 0.1)] = 0.0950.
We will repeat this step until y1 becomes stationary. So, with x1 = 0.1,
y1c(2) = 0 + (0.1/2)[1 + (1 − 0.0950)] = 0.0953
y1c(3) = 0 + (0.1/2)[1 + (1 − 0.0953)] = 0.0952
y1c(4) = 0 + (0.1/2)[1 + (1 − 0.0952)] = 0.0952
Since y1c(3) and y1c(4) are the same, we stop here with x1 = x0 + h = 0 + 0.1 = 0.1 and y1 = 0.0952 (i.e., at x = 0.1).
Similarly, we will calculate for x = 0.2 and x = 0.3. The calculations are arranged as shown in Table 5.1.
Similarly, we will calculate for x = 0.2 and x = 0.3 also. The calculations are
arranged as shown in Table 5.1.
Table 5.1 Calculations

x     y′ = 1 – y             Mean Slope = [f(xn, yn) + f(xn+1, y_{n+1}^p)]/2     y_{n+1}^c (New) = yn + h(Mean Slope)
0.0   1 – 0 = 1              —                                                   0 + 0.1(1) = 0.1
0.1   1 – 0.1 = 0.9          (1/2)(1 + 0.9) = 0.95                               0 + 0.1(0.95) = 0.095
0.1   1 – 0.095 = 0.905      (1/2)(1 + 0.905) = 0.9525                           0 + 0.1(0.9525) = 0.0952
0.1   1 – 0.0952 = 0.9048    (1/2)(1 + 0.9048) = 0.9524                          0 + 0.1(0.9524) = 0.0952
0.1   0.9048                 —                                                   0.0952 + 0.1(0.9048) = 0.1857
0.2   1 – 0.1857 = 0.8143    (1/2)(0.9048 + 0.8143) = 0.8595                     0.0952 + 0.1(0.8595) = 0.1812
0.2   1 – 0.1812 = 0.8188    (1/2)(0.9048 + 0.8188) = 0.8618                     0.0952 + 0.1(0.8618) = 0.1814
0.2   1 – 0.1814 = 0.8186    (1/2)(0.9048 + 0.8186) = 0.8617                     0.0952 + 0.1(0.8617) = 0.1814
0.2   1 – 0.1814 = 0.8186    —                                                   0.1814 + 0.1(0.8186) = 0.2634
0.3   1 – 0.2634 = 0.7366    (1/2)(0.8186 + 0.7366) = 0.7776                     0.1814 + 0.1(0.7776) = 0.2592
0.3   1 – 0.2592 = 0.7408    (1/2)(0.8186 + 0.7408) = 0.7797                     0.1814 + 0.1(0.7797) = 0.2594
0.3   1 – 0.2594 = 0.7406    (1/2)(0.8186 + 0.7406) = 0.7796                     0.1814 + 0.1(0.7796) = 0.2594
Hence, y(0.1) = 0.0952, y(0.2) = 0.1814 and y(0.3) = 0.2594 (approximately).
Exact Solution: We have,
dy/dx = 1 – y, y(0) = 0 or dy/(1 − y) = dx
On integrating, we get
–log(1 – y) + log c = x
or, c/(1 − y) = e^x or e^x(1 – y) = c
At x = 0, y = 0; thus e^0(1 – 0) = c ⇒ c = 1, so the exact solution is,
e^x(1 – y) = 1 or y = 1 − e^(−x)
Now we can see that, up to four decimal places,
x = 0.1 y = 1 – e–0.1 = 0.0952
x = 0.2 y = 1 – e–0.2 = 0.1813
and, x = 0.3 y = 1 – e–0.3 = 0.2592.
Example 5.14: Solve the following by modified Euler’s method:
y′ = log(y + x), y(0) = 2
in the range 0 ≤ x ≤ 0.4 with h = 0.1
Solution. The various calculations are arranged as shown in Table 5.2 (here log denotes log10).
Table 5.2 Calculations Using Modified Euler’s Method

x     y′ = log(x + y)             Mean Slope = [log(xn + yn) + log(xn+1 + y_{n+1}^p)]/2     y_{n+1}^c = yn + h(Mean Slope)
0.0   log(0 + 2) = 0.3010         —                                                         2 + 0.1(0.3010) = 2.0301
0.1   log(2.1301) = 0.3284        (1/2)(0.3010 + 0.3284) = 0.3147                           2 + 0.1(0.3147) = 2.0315
0.1   log(2.1315) = 0.3287        (1/2)(0.3010 + 0.3287) = 0.3149                           2 + 0.1(0.3149) = 2.0315
0.1   log(0.1 + 2.0315) = 0.3287  —                                                         2.0315 + 0.1(0.3287) = 2.0644
0.2   log(2.2644) = 0.3550        (1/2)(0.3287 + 0.3550) = 0.3418                           2.0315 + 0.1(0.3418) = 2.0657
0.2   log(2.2657) = 0.3552        (1/2)(0.3287 + 0.3552) = 0.3419                           2.0315 + 0.1(0.3419) = 2.0657
0.2   log(0.2 + 2.0657) = 0.3552  —                                                         2.0657 + 0.1(0.3552) = 2.1012
0.3   log(2.4012) = 0.3804        (1/2)(0.3552 + 0.3804) = 0.3678                           2.0657 + 0.1(0.3678) = 2.1025
0.3   log(2.4025) = 0.3807        (1/2)(0.3552 + 0.3807) = 0.3679                           2.0657 + 0.1(0.3679) = 2.1025
0.3   log(0.3 + 2.1025) = 0.3807  —                                                         2.1025 + 0.1(0.3807) = 2.1406
0.4   log(2.5406) = 0.4049        (1/2)(0.3807 + 0.4049) = 0.3928                           2.1025 + 0.1(0.3928) = 2.1418
0.4   log(2.5418) = 0.4051        (1/2)(0.3807 + 0.4051) = 0.3929                           2.1025 + 0.1(0.3929) = 2.1418
So, at x = 0.1, y = 2.0315
at x = 0.2, y = 2.0657
at x = 0.3, y = 2.1025
and at x = 0.4, y = 2.1418

5.6.2 Milne’s Predictor–Corrector Method

Let dy/dx = f(x, y); y(x0) = y0 be an ordinary differential equation.
To compute the value of y we use the two formulae, predictor and corrector, as:
(i) Milne’s predictor formula
y_{n+1} = y_{n−3} + (4h/3)(2f_{n−2} − f_{n−1} + 2f_n); n = 3, 4, ...
(ii) Milne’s corrector formula
y_{n+1} = y_{n−1} + (h/3)(f_{n−1} + 4f_n + f_{n+1}); n = 1, 2, ...
Knowing four consecutive values of y, namely y_{n−3}, y_{n−2}, y_{n−1} and y_n, we first calculate y_{n+1} from the predictor formula and then use it in the corrector formula to obtain an improved value of y_{n+1}.
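The predictor-corrector pair can be sketched in code as follows (function names are my own; the corrector is iterated until successive values agree):

```python
def milne(f, xs, ys, h, x_end, tol=1e-6):
    # xs, ys hold four consecutive starting values of x and y.
    xs, ys = list(xs), list(ys)
    fs = [f(x, y) for x, y in zip(xs, ys)]
    while xs[-1] < x_end - 1e-12:
        x_new = xs[-1] + h
        # Milne's predictor: y_{n+1} = y_{n-3} + (4h/3)(2f_{n-2} - f_{n-1} + 2f_n)
        y_new = ys[-4] + 4 * h / 3 * (2 * fs[-3] - fs[-2] + 2 * fs[-1])
        while True:
            # Milne's corrector: y_{n+1} = y_{n-1} + (h/3)(f_{n-1} + 4f_n + f_{n+1})
            y_corr = ys[-2] + h / 3 * (fs[-2] + 4 * fs[-1] + f(x_new, y_new))
            if abs(y_corr - y_new) < tol:
                break
            y_new = y_corr
        xs.append(x_new)
        ys.append(y_corr)
        fs.append(f(x_new, y_corr))
    return ys[-1]

# dy/dx = 1 + x*y^2 with starting values y(0)=1, y(0.1)=1.105,
# y(0.2)=1.223, y(0.3)=1.355 and h = 0.1 (as in Example 5.16 below).
y = milne(lambda x, y: 1 + x * y * y,
          [0.0, 0.1, 0.2, 0.3], [1.0, 1.105, 1.223, 1.355], 0.1, 0.4)
print(round(y, 3))  # ≈ 1.538
```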
Example 5.15: Apply Milne’s method, to find a solution of the differential equation
y′ = x2 – y in the range 0 ≤ x ≤ 1 for the boundary condition y = 0 at x = 0.
Solution. Using Picard’s method, we have,
y = y(0) + ∫₀ˣ f(x, y) dx, given that f(x, y) = x² – y
so, y1 = y0 + ∫₀ˣ (x² − y0) dx = 0 + ∫₀ˣ x² dx = x³/3
y2 = y0 + ∫₀ˣ f(x, y1) dx = 0 + ∫₀ˣ (x² − x³/3) dx = x³/3 − x⁴/12
y3 = y0 + ∫₀ˣ f(x, y2) dx = 0 + ∫₀ˣ (x² − x³/3 + x⁴/12) dx = x³/3 − x⁴/12 + x⁵/60
Now we will determine the starting values of y for Milne’s method by taking h = 0.2.


x0 = 0,   y0 = 0,                                            f0 = 0
x1 = 0.2, y1 = (0.2)³/3 − (0.2)⁴/12 + (0.2)⁵/60 = 0.0025,   f1 = 0.0375
x2 = 0.4, y2 = (0.4)³/3 − (0.4)⁴/12 + (0.4)⁵/60 = 0.0194,   f2 = 0.1406
x3 = 0.6, y3 = (0.6)³/3 − (0.6)⁴/12 + (0.6)⁵/60 = 0.0625,   f3 = 0.2975
Now, Milne’s predictor–corrector formulae are,
y4p = y0 + (4h/3)(2f1 − f2 + 2f3)
y4c = y2 + (h/3)(f2 + 4f3 + f4)
Using the predictor formula,
y4p = 0 + (4 × 0.2/3)(2 × 0.0375 − 0.1406 + 2 × 0.2975) = 0.1412
so, x4 = 0.8, y4p = 0.1412 and f4 = 0.4988
Now using the corrector formula,
y4c = 0.0194 + (0.2/3)[0.1406 + 4 × 0.2975 + 0.4988] = 0.1414.
Now repeat the above step with x4 = 0.8, y4 = 0.1414 and f4 = 0.4986 until the two values of y4 are approximately equal.
so, y4c(1) = 0.0194 + (0.2/3)[0.1406 + 4 × 0.2975 + 0.4986] = 0.1413.
Thus, x 4 = 0.8, y4(1) = 0.1413, f4(1) = 0.4987
again, y4c(2) = 0.0194 + (0.2/3)[0.1406 + 4 × 0.2975 + 0.4987] = 0.1414.
We can see that y4( c ) , y4( c ) (1) and y4( c ) (2) are approximately equal. Thus,



at x = 0.8, y = 0.1414 and f(0.8, 0.1414) = 0.4987
Now, we will calculate y for x = 1.0 taking x1, x2, x3, x4, y1, y2, y3, y4 and f1, f2, f3, f4 respectively and h = 0.2.
Using the predictor formula,
y5p = y1 + (4h/3)[2f2 − f3 + 2f4]
= 0.0025 + (4 × 0.2/3)[2 × 0.1406 – 0.2975 + 2 × 0.4987]
= 0.2641
Now applying the corrector formula,
y5c = y3 + (h/3)(f3 + 4f4 + f5)
Here, x5 = 1.0, y5 = 0.2641 and f5 = 0.7359
so, y5c(1) = 0.0625 + (0.2/3)[0.2975 + 4 × 0.4987 + 0.7359] = 0.2644
Again, with x5 = 1.0, y5 = 0.2644 and f5 = 0.7356
y5c(2) = 0.0625 + (0.2/3)[0.2975 + 4 × 0.4987 + 0.7356] = 0.2644.

Clearly, y5c(1) = y5c(2); so, y(1.0) = 0.2644


Example 5.16: Solve the initial value problem,
dy/dx = 1 + xy², y(0) = 1
for x = 0.4 by using Milne’s method, when it is given that,
x: 0.1 0.2 0.3
y: 1.105 1.223 1.355
Solution. Milne’s predictor–corrector formulae are,
y4p = y0 + (4h/3)(2f1 − f2 + 2f3)
and y4c = y2 + (h/3)(f2 + 4f3 + f4)



Now given that,
x0 = 0,   y0 = 1,     f0 = 1
x1 = 0.1, y1 = 1.105, f1 = 1.1221
x2 = 0.2, y2 = 1.223, f2 = 1.2991
x3 = 0.3, y3 = 1.355, f3 = 1.5508
Using the predictor formula,
y4p = 1 + (4 × 0.1/3)[2 × 1.1221 − 1.2991 + 2 × 1.5508] = 1.5396
Now applying the corrector formula, with f4 = 1 + 0.4 × (1.5396)² = 1.9481,
y4c(1) = 1.223 + (0.1/3)[1.2991 + 4 × 1.5508 + 1.9481] = 1.5380
Thus, f4(1) = 1 + 0.4 × (1.5380)² = 1.9462
again, y4c(2) = 1.223 + (0.1/3)[1.2991 + 4 × 1.5508 + 1.9462] = 1.5380 ≅ y4c(1)
Thus, at x = 0.4, y = 1.5380 and f = 1.9462
Example 5.17: Use Milne’s method to find y(0.3) from y′ = x2 + y2, y(0) = 1.
Find the initial values y(–0.1), y(0.1) and y(0.2) from the Taylor’s series method.
Solution: We have y′ = x² + y² with y(0) = 1
∴ y′(0) = 1
y″(x) = 2x + 2y·y′                 ∴ y″(0) = 2
y″′(x) = 2 + 2yy″ + 2y′²           ∴ y″′(0) = 8
y⁗(x) = 2yy″′ + 2y′y″ + 4y′y″      ∴ y⁗(0) = 28
y⁽ᵛ⁾(x) = 2yy⁗ + 8y′y″′ + 6y″²     ∴ y⁽ᵛ⁾(0) = 144



Thus, the Taylor’s series solution is given by,
y(x) = y(0) + xy′(0) + (x²/2!)y″(0) + (x³/3!)y″′(0) + ...
= 1 + x·1 + (x²/2!) × 2 + (x³/3!) × 8 + (x⁴/4!) × 28 + (x⁵/5!) × 144 + ...
so, y(x) ≈ 1 + x + x² + 4x³/3 + 7x⁴/6 + 6x⁵/5
∴ y(–0.1) = 0.9088, y(0.1) = 1.1115 and y(0.2) = 1.2529.
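The truncated series is easy to evaluate directly; a small sketch:

```python
def taylor_y(x):
    # Terms of the series solution up to x^5:
    # y(x) ≈ 1 + x + x^2 + 4x^3/3 + 7x^4/6 + 6x^5/5
    return 1 + x + x**2 + 4 * x**3 / 3 + 7 * x**4 / 6 + 6 * x**5 / 5

print(round(taylor_y(-0.1), 4))  # 0.9088
print(round(taylor_y(0.1), 4))   # 1.1115
print(round(taylor_y(0.2), 4))   # 1.2529
```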
Thus,
x0 = – 0.1,   y0 = 0.9088,  f0 = 0.8359
x1 = 0,       y1 = 1,       f1 = 1
x2 = 0.1,     y2 = 1.1115,  f2 = 1.2454
and x3 = 0.2, y3 = 1.2529,  f3 = 1.6098
Using Milne’s predictor formula,
y4p = y0 + (4h/3)[2f1 − f2 + 2f3]
= 0.9088 + (4 × 0.1/3)[2 × 1 – 1.2454 + 2 × 1.6098]
= 1.4387
Now using Milne’s corrector formula,
y4c = y2 + (h/3)[f2 + 4f3 + f4]
with x4 = 0.3, y4 = 1.4387 and f4 = 2.1598
y4c(1) = 1.1115 + (0.1/3) × [1.2454 + 4 × 1.6098 + 2.1598]
= 1.4396 and hence f4(1) = 2.1624
again,
y4c(2) = 1.1115 + (0.1/3)[1.2454 + 4 × 1.6098 + 2.1624]
= 1.4397 with f4(2) = 2.1628

and y4c(3) = 1.1115 + (0.1/3)[1.2454 + 4 × 1.6098 + 2.1628] = 1.4397.
Hence, y(0.3) = 1.4397 (correct up to 4 decimal places).

CHECK YOUR PROGRESS


7. What are the Predictor–Corrector methods?
8. What is the main and the most important objective of numerical methods?

5.7 SUMMARY

In this unit, you have learned that:


• The Taylor series method can be applied to those functions which are not
complicated enough to calculate the derivatives.
• The Euler’s method can be described as a technique of developing a
piecewise linear approximation to the solution. It is the oldest and the simplest
method.
• In the initial value problem the starting point of the solution curve and the
slope of the curve at the starting point are given. Euler’s method uses this
information and extrapolates the solution curve using the specified step size.
• The Runge–Kutta methods, derived by C. Runge and extended by W. Kutta, are a family of methods of which the second order and fourth order methods are widely used. In these methods, first the slope at some of the intermediate points is computed and then a weighted average of the slopes is used to extrapolate the new solution point.
• The Runge–Kutta second order method matches the Taylor series method up to the second-degree terms in h, where h is the step size.
• Some multi-step methods which make use of the past information of the
curve to extrapolate the solution curve are known as Predictor-Corrector
methods.
• In Milne’s predictor–corrector method, by knowing four consecutive values
of y as yn – 3, yn – 2, yn – 1 and yn we calculate yn + 1 using predictor method
and then use yn + 1 in the corrector formula to find the value of yn + 1.
• In order to derive the expression for the Adams–Bashforth method, consider the differential equation,
dy/dx = f(x, y) or dy = f(x, y) dx



then, the Adams–Bashforth predictor formula is,
y_{n+1}^p = y_n + (h/24)[−9f_{n−3} + 37f_{n−2} − 59f_{n−1} + 55f_n]
and the corrector formula is,
y_{n+1}^c = y_n + (h/24)[f_{n−2} − 5f_{n−1} + 19f_n + 9f_{n+1}^p].
• The numerical solutions of differential equations certainly differ from their
exact solutions. The total error at any stage say ‘r’ is the difference between
the computed value yr and the true value y(xr).
Total Error = Truncation Error + Round-off Error
• The main and the most important objective of numerical methods is to
minimize the error and to obtain most appropriate solutions. Truncation
error can be reduced in any method by taking small subintervals whereas
the round-off error cannot be controlled easily unless the computer has the
double precision arithmetic facility.

5.8 KEY TERMS

• Euler’s method: It is a technique of developing a piecewise linear


approximation to the solution.
• Runge–Kutta method: It is the method in which first the slope at some of
the intermediate points is computed and then weighted average of slopes is
used to extrapolate the new solution point.
• Predictor–Corrector methods: These are multi step methods which make
use of the past information of the curve to extrapolate the solution curve.

5.9 ANSWERS TO ‘CHECK YOUR PROGRESS’

1. The first order differential equation is dy/dx = f(x, y) given y(x0) = y0.
2. The first approximation by Picard’s method is,
y⁽¹⁾(x) = y(0) + ∫₀ˣ (x y⁽⁰⁾ + 1) dx
so that, taking y⁽⁰⁾ = 1,
y⁽¹⁾(x) = 1 + ∫₀ˣ (x + 1) dx = 1 + x + x²/2



3. Expanding the function using Taylor series we get
y(x) = y(x0) + (x − x0) y′(x0) + ((x − x0)²/2!) y″(x0) + ...
where y′(x0), y″(x0) are the first order and second order derivatives, and so on.
4. Euler’s method can be described as a technique of developing a piecewise
linear approximation to the solution.
5. The Runge–Kutta method was derived by C. Runge and extended by W.
Kutta.
6. It is the second and the fourth order methods that are widely used in the
Runge–Kutta family of methods.
7. Predictor–Corrector methods are those methods which make use of the
past information of the curve to extrapolate the solution curve.
8. The main and most important objective of numerical methods is to minimize
the error and obtain the most appropriate solutions.

5.10 QUESTIONS AND EXERCISES

Short-Answer Questions
1. Write the equation used in Picard’s method of successive approximations for a differential equation of first order.
2. To which type of functions can the Taylor series method be applied?
3. Define Euler’s method.
4. Write the equation used in the Runge–Kutta second order method.
5. Write the formulae for Milne’s predictor and corrector methods.

Long-Answer Questions
1. Using Taylor’s series method, solve dy/dx = x² – y, y(0) = 1 at x = 0.1, 0.2, 0.3 and 0.4. Compare the values with the exact solution.
2. Use Picard’s method to approximate the value of y when x = 0.1, given that y = 1 when x = 0 and dy/dx = (y − x)/(y + x).
3. Use Picard’s method to find two successive approximate solutions of the initial value problem,
dy/dx = (y − x)/(y + x), y(0) = 1



4. Solve dy/dx = (2xy − eˣ)/(x² + xeˣ) with y(1) = 0 for x = 1.2 and 1.4 by taking step size h = 0.1.
5. Solve dy/dx = y − 2x/y, y(0) = 1 in the range 0 ≤ x ≤ 0.2 using:
(i) Euler’s method
(ii) Modified Euler’s method
6. Compute y(0.2) and y(0.4) correct upto three decimal places by solving
the following initial value problem using Picard’s method.
7. Compute the solution of the following initial value problem by Euler’s method, for x = 0.1 correct to four decimal places, taking h = 0.02.
dy/dx = (y − x)/(y + x), y(0) = 1
8. Given xy′ = x – y², y(2) = 1, evaluate y(2.1), y(2.2), y(2.3) correct up to four decimal places using Taylor series method.

9. Apply Milne’s method to find a solution of the differential equation dy/dx = x – y² in the range 0 ≤ x ≤ 1 for the boundary condition y = 0 at x = 0. (Given h = 0.2)
10. Tabulate the solution of y′ = x2 + y, y(0) = 0 for 0.4 ≤ x ≤ 1.0 with h = 0.1,
using the Predictor–Corrector formula.
11. Given that,
x: 1 1.1 1.2 1.3
y: 1 1.233 1.548488 1.978921
For the differential equation dy/dx = x²(1 + y), determine y(1.4) using Milne’s method.

12. Solve d²y/dx² + x·(dy/dx) + y = 0, y(0) = 1 and y′(0) = 0, and obtain y for x = 0.1, 0.2 and 0.3 by any method. Further apply Milne’s method to compute y(0.4).

5.11 FURTHER READING

Mott, J.L. 2007. Discrete Mathematics for Computer Scientists, 2nd Edition. New Delhi: Prentice-Hall of India Pvt. Ltd.



Stockton, John R., and Charles T. Clark. 1980. Introduction to Business and Economic Statistics. Ohio: South-Western Publishing Co.
Charnes, A., W.W. Cooper, and A. Henderson. 1953. An Introduction to Linear
Programming. New York: John Wiley & Sons.


UNIT 6 PROBABILITY DISTRIBUTION
Structure
6.0 Introduction
6.1 Unit Objectives
6.2 Classical Approach to Probability
6.2.1 Sample Space
6.3 Random Variables
6.3.1 Types of Random Variables
6.3.2 Joint and Marginal Probability Density Function
6.4 Discrete Theoretical Distributions
6.4.1 Binomial Distribution
6.4.2 Moments
6.4.3 Poisson Distribution with Mean and Variance
6.4.4 Uniform Distribution
6.5 Summary
6.6 Key Terms
6.7 Answers to ‘Check Your Progress’
6.8 Questions and Exercises
6.9 Further Reading

6.0 INTRODUCTION

In this unit, you will learn about probability distribution. Probability distribution
refers to mathematical models of relative frequencies of a finite number of
observations of a variable by listing all the likely results of an experiment. It is a
symmetric arrangement of probabilities associated with mutually exclusive and
collectively exhaustive elementary events in an experiment.
This unit will also walk you through random variables and discrete theoretical
distributions. These are supported by a number of examples to enhance
understanding of the subject clearly.

6.1 UNIT OBJECTIVES

After going through this unit, you will be able to:


• Understand random variables
• Classify the types of random variables
• Explain moment generating function
• Discuss discrete theoretical distribution



6.2 CLASSICAL APPROACH TO PROBABILITY

The classical theory of probability is based on the number of favourable outcomes and the total number of outcomes; the probability is expressed as the ratio of these two numbers. The term ‘favourable’ is not a subjective value given to the outcomes, but is rather the classical terminology used to indicate that an outcome belongs to a given event of interest.
Classical definition of probability: If the number of outcomes belonging to an event E is NE, and the total number of outcomes is N, then the probability of event E is defined as pE = NE/N.
For example, a standard pack of cards (without jokers) has 52 cards. If we randomly draw a card from the pack, we can regard each card as a possible outcome. Therefore, there are 52 total outcomes. Calculating all the outcome events and their probabilities, we have the following possibilities:
• Out of the 52 cards, there are 13 clubs. Therefore, if the event of interest is drawing a club, there are 13 favourable outcomes, and the probability of this event becomes: 13/52 = 1/4.
• There are 4 kings (one of each suit). The probability of drawing a king is: 4/52 = 1/13.
• What is the probability of drawing a king or a club? This example is slightly more complicated. We cannot simply add together the number of outcomes for each event separately (4 + 13 = 17), as this inadvertently counts one of the outcomes twice (the king of clubs). The correct answer is 16/52, from 13/52 + 4/52 − 1/52.
We have this from the probability equation, P(club) + P(king) – P(king of clubs).
• Classical probability has limitations, because this definition implicitly assumes all outcomes to be equiprobable, and it can be used only for situations such as drawing cards, rolling dice, or pulling balls from urns. We cannot calculate probabilities where the outcomes are not equally likely.
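The card calculations above can be carried out with exact fractions; a sketch using Python's standard fractions module:

```python
from fractions import Fraction

p_club = Fraction(13, 52)           # 13 clubs in the pack
p_king = Fraction(4, 52)            # 4 kings
p_king_of_clubs = Fraction(1, 52)   # counted by both events

# P(club or king) = P(club) + P(king) - P(king of clubs)
p_either = p_club + p_king - p_king_of_clubs
print(p_either)  # 4/13, i.e. 16/52 in lowest terms
```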
Despite the limitations described above, the classical theory of probability is still useful. We can use it as an important guide for calculating the probability of situations such as those mentioned above, and as the starting point for the axiomatic approach to probability.


Frequency of Occurrence
This approach to probability is widely used across a wide range of scientific disciplines. It is based on the idea that the underlying probability of an event can be measured by repeated trials.
Probability as a measure of frequency: Let nA be the number of times event A occurs after n trials. We define the probability of event A as,
P(A) = lim (n → ∞) nA/n
It is not possible to conduct an infinite number of trials. However, it usually
suffices to conduct a large number of trials, where the standard of large depends
on the probability being measured and how accurate a measurement we need.
Definition of Probability

There is no guarantee that the sequence nA/n in the limit will converge to the same result every time, or that it will converge at all. To understand this, consider an experiment consisting of flipping a coin an infinite number of times, where we want the probability that heads comes up. The result may appear as the following sequence:

HTHHTTHHHHTTTTHHHHHHHHTTTTTTTTHHHHHHHHHHHHHHHHTTTTTTTTTTTTTTTT...

Here each run of k heads or k tails is followed by another run twice as long. For this sequence, nA/n oscillates between 1/3 and 2/3 and does not converge. Such sequences may be unlikely, but they are possible. The definition given above does not express convergence in the required way; it captures only a kind of convergence in probability. The problem of formulating this exactly can be addressed using axiomatic probability theory.
Empirical Probability Theory
The empirical approach to determining probabilities relies on data from actual experiments to determine approximate probabilities instead of the assumption of equal likeliness. Probabilities in these experiments are defined as the ratio of the frequency of occurrence of an event, f(E), to the number of trials in the experiment, n, written symbolically as P(E) = f(E)/n. For example, while flipping a coin, the empirical probability of heads is the number of heads divided by the total number of flips.
The relationship between these empirical probabilities and the theoretical probabilities is suggested by the Law of Large Numbers. The law states that as the number of trials of an experiment increases, the empirical probability approaches the theoretical probability. Hence, if we roll a die a number of times, each number would come up approximately 1/6 of the time. The study of empirical probabilities is known as statistics.
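The Law of Large Numbers can be illustrated by simulation (a sketch; the seed and the number of trials are arbitrary choices):

```python
import random

random.seed(42)   # fixed seed so the experiment is reproducible
n = 100_000       # number of coin flips
heads = sum(random.random() < 0.5 for _ in range(n))
empirical_p = heads / n   # P(E) = f(E)/n
print(empirical_p)        # close to the theoretical 0.5
```

Increasing n drives the empirical frequency closer to the theoretical probability, which is exactly what the law asserts.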
6.2.1 Sample Space
A sample space is the collection of all possible events or outcomes of an experiment.
For example, there are two possible outcomes of a toss of a fair coin: a head and
a tail. Then, the sample space for this experiment denoted by S would be:
S = [H, T]
So that the probability of the sample space equals 1, or
P[S] = P[H,T] =1
This is so because in the toss of the coin, either a head or a tail, must occur.
Similarly, when we roll a die, any of the six faces can come as a result of the roll
since there are a total of six faces. Hence, the sample space is S = [1, 2, 3, 4, 5,
6], and P[S] = 1, since one of the six faces must occur.

6.3 RANDOM VARIABLES

Probability distribution refers to mathematical models of relative frequencies of a


finite number of observations of a variable. This is done by listing all the possible
outcomes of an experiment. It is a symmetric arrangement of probabilities associated
with mutually exclusive and collectively exhaustive elementary events in an
experiment. There are two categories of probability distribution based on the nature
of random variable. The distribution, which is based on discrete or finite random
variables is known as discrete or theoretical probability distribution, whereas the
distribution in which random variable is continuous is known as continuous
probability distribution. Here, some of these distributions will be discussed.
If the heights and weights of the students of a class are measured, we get a pair of real numbers for each student. Outcomes in these cases are quantitative, i.e., they can be measured or counted. There may be qualitative outcomes too. For example, in the case of tossing a coin the outcome may be head or tail, which is not quantitative. But for convenience, qualitative outcomes may be expressed in quantitative form. For example, in tossing a coin the outcome ‘head’ is denoted by 1 and ‘tail’ by 0. In this way, each outcome of a random experiment, whether qualitative or quantitative, can be expressed by a real number. This numerical value associated with the outcome of a random experiment is called a random variable.



In other words, a variable which can take certain values depending on chance is called a random variable.


Random variable as a function: If for every outcome s of a sample space S there is a real number denoted by X(s), then X is called a function defined on S.
Definition of random variable: A real valued function X defined on a sample
space of random experiment E, is called a random variable which assigns to each
sample point, one and only one real number X(s) where s ∈ S.
Or
If E is a random experiment and s is the outcome of a sample space S
associated with it, then a function X(s), where s ∈ S is called a random variable.
6.3.1 Types of Random Variables
There are two types of random variables which are as follows:
1. Discrete random variable
2. Continuous random variable
1. Discrete random variable: A random variable is said to be a discrete
random variable if it assumes only a finite or countable infinite number of
values of X.
In finite case, X may take values x1, x2, ..., xn and x1, x2, ... in countable
infinite case. The countable infinite case will have an infinite sequence of
distinct values, and that sequence will be countable.

For example: 1, 1/2, 1/3, ..., 1/n, ... is a countably infinite sequence.
2. Continuous random variable: If a random variable assumes any value in
some interval or intervals it is called a continuous random variable. Also if a
variable can take an infinite set of values in the given interval, say a ≤ X ≤ b,
it is a continuous random variable and its distributions are accordingly known
as continuous distributions.
For example, Y = (3/2) sin x (0 ≤ x ≤ π) is an example of a continuous distribution, as X can assume all values lying between 0 and π.
Probability Function of a Discrete Random Variable: Let X be a random variable which can take the values x1, x2, ... . The probability that X = xi, denoted by P(X = xi) or p(xi) or Pi, is called the probability function of X.
Here P(xi) ≥ 0 and Σ P(xi) = 1
Probability Distribution of a Discrete Random Variable: The probability distribution of a discrete random variable is a set of ordered pairs [xi, P(xi)].



(i) P(xi) ≥ 0 ; i = 1, 2, 3, ...
(ii) Σ P(xi) = 1
In other words, the distribution obtained by taking the possible values of a
NOTES random variable together with their respective probabilities is called probability
distribution.
Mathematical Expectation: Let X be a discrete random variable which takes
the values x1, x2, ... xn with respective probabilities P1, P2, ... Pn such that P1 +
P2 + ... + Pn = 1.
The mathematical expectation of X, or the expected value of X, denoted by E(X), is defined as
E(X) = P1x1 + P2x2 + ... + Pnxn = Σ_{i=1}^{n} Pi xi
or E(X) = Σ x P(x)
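The defining sum E(X) = Σ Pi xi translates directly into code. A minimal Python sketch (the function name `expectation` and the example distribution are illustrative, not from the text):

```python
def expectation(values, probs):
    """E(X) = sum of x_i * P_i over the support of X."""
    # A valid probability function must satisfy sum(P_i) = 1.
    assert abs(sum(probs) - 1.0) < 1e-9
    return sum(x * p for x, p in zip(values, probs))

# Hypothetical distribution: X takes 0, 1, 2 with probabilities 1/4, 1/2, 1/4.
mean = expectation([0, 1, 2], [0.25, 0.5, 0.25])
```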
Theorem 6.1: The expectation of the sum of two random variables is equal to the sum of their expectations, i.e.,
E(X + Y) = E(X) + E(Y)
where X and Y are two random variables.
Proof: Let X be a random variable with values x1, x2, ..., xn and probabilities P1, P2, ..., Pn. Let Y be another random variable with values y1, y2, ..., ym and probabilities P′1, P′2, ..., P′m.
Then the random variable (X + Y) can take the mn values xi + yj (i = 1, 2, ..., n and j = 1, 2, ..., m). Let Pij be the probability that X assumes the value xi and Y assumes the value yj.
Now if X assumes a definite value xi while Y assumes any one of the values y1, y2, ..., ym, the sum Σ_{j=1}^{m} Pij represents the probability Pi of X assuming the value xi,
i.e., Σ_{j=1}^{m} Pij = Pi
Similarly, Σ_{i=1}^{n} Pij = P′j. Then,
E(X + Y) = Σ_{i=1}^{n} Σ_{j=1}^{m} Pij (xi + yj)

= Σ_{i=1}^{n} Σ_{j=1}^{m} Pij xi + Σ_{i=1}^{n} Σ_{j=1}^{m} Pij yj
= Σ_{i=1}^{n} xi Pi + Σ_{j=1}^{m} P′j yj
= E(X) + E(Y)
Thus, E(X + Y) = E(X) + E(Y)
Corollary: E(X + Y + Z + ...) = E(X) + E(Y) + E(Z) + ...
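Theorem 6.1 can be checked numerically for any joint probability table Pij; note that independence is not required for this identity. A small sketch with an arbitrary (hypothetical) joint pmf:

```python
# Joint pmf P(X = x, Y = y); the four probabilities sum to 1.
pij = {(0, 0): 0.1, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.4}

E_X = sum(p * x for (x, y), p in pij.items())          # marginal mean of X
E_Y = sum(p * y for (x, y), p in pij.items())          # marginal mean of Y
E_sum = sum(p * (x + y) for (x, y), p in pij.items())  # E(X + Y)
```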
Notes: 1. Expected value of a constant is the constant itself.
i.e. E(C) = C where C is a constant
2. E(X) is also known as the mean of variable X.
3. If X is a random variable and f(X) is a function of X which takes the
values f(x1), f(x2), ... when X takes the values x1, x2, ... with probabilities P1, P2,
... respectively, then,
E(f(X)) = P1. f(x1) + P2. f(x2) + ... + Pn. f(xn)
and ΣPi = 1
Now if f(X) = X^r, then,
E(X^r) = P1 x1^r + P2 x2^r + ... + Pn xn^r = Σ Pi xi^r
This is defined as the rth moment of the discrete probability distribution about x = 0 and is denoted by µ′r.
Also, µr = E(X – E(X))^r
4. µ′1 = E(X) = P1x1 + P2x2 + ... + Pnxn = Σ Pi xi clearly.
5. The first moment about mean is zero. Let X be a random variable with the following probability distribution,

X : x1 x2 ... xn
P : P1 P2 ... Pn

Then, E(X) = Σ X P(X) = Σ_{i=1}^{n} xi Pi
and µ1 = E(X – E(X))
= E[X + (–E(X))] = E(X) + E[–E(X)]
= E(X) + E(–C) [∵ E(X) = C is a constant]
= E(X) – E(X) = 0 [by E(C) = C]
and µ2 = E(X – E(X))^2
= E[X^2 – 2X E(X) + {E(X)}^2]
= E(X^2) – 2[E(X)]^2 + [E(X)]^2
= E(X^2) – [E(X)]^2
This is called the variance of the variable X, denoted as Var(X).
Since Var(X) ≥ 0 ⇒ E(X^2) – [E(X)]^2 ≥ 0
⇒ E(X^2) ≥ [E(X)]^2
Theorem 6.2: The expected value of the product of two independent discrete random variables is equal to the product of their expectations, i.e.,
E(XY) = E(X) · E(Y).
Notes: 1. Let X and Y be two discrete random variables. If X takes values x1, x2, ..., xn with probabilities P1, P2, ..., Pn, and Y takes values y1, y2, ..., ym with probabilities P′1, P′2, ..., P′m, then the function
P(X = xi, Y = yj), i = 1, 2, ..., n and j = 1, 2, ..., m,
denoted by P(xi, yj) or Pij, is called the joint probability function of X and Y.
If P(X = xi, Y = yj) = P(X = xi) · P(Y = yj),
i.e., Pij = Pi · P′j,
then the random variables X and Y are said to be independent random variables.
2. E(XYZ ...) = E(X). E(Y). E(Z)...
where, X, Y, Z, ... are independent random variables.
Covariance: If E(X) and E(Y) denote the expected value (or mean) of the two
random variables X and Y, then the covariance between X and Y is denoted as
Cov (X, Y) and defined as,
Cov(X, Y) = E[(X – E(X)) . (Y – E(Y))]
= E(XY) – E(X). E(Y).
Note: 1. The covariance of two independent random variables is equal to zero.
i.e. Cov(X, Y) = E(XY) – E(X). E(Y) = 0
if X and Y are independent random variables.
Proof: By definition, Cov(X, Y) = E(XY) – E(X) · E(Y). If X and Y are independent random variables, then E(XY) = E(X) · E(Y), and hence Cov(X, Y) = E(X) · E(Y) – E(X) · E(Y) = 0.
Correlation coefficient: For two random variables X and Y, the correlation coefficient ρXY or r(X, Y) is defined as
r(X, Y) = ρXY = Cov(X, Y) / √(Var(X) · Var(Y)) = Cov(X, Y) / (σX · σY)
where σX = √Var(X) and σY = √Var(Y) are the standard deviations of the variables X and Y respectively.
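The definitions of Cov(X, Y) and ρXY can be sketched for a discrete joint distribution as follows (the joint table below is illustrative, not from the text):

```python
import math

# Hypothetical joint pmf P(X = x, Y = y); probabilities sum to 1.
pij = {(0, 0): 0.3, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.4}

def E(f):
    """Expectation of f(X, Y) under the joint pmf."""
    return sum(p * f(x, y) for (x, y), p in pij.items())

cov = E(lambda x, y: x * y) - E(lambda x, y: x) * E(lambda x, y: y)
var_x = E(lambda x, y: x * x) - E(lambda x, y: x) ** 2
var_y = E(lambda x, y: y * y) - E(lambda x, y: y) ** 2
rho = cov / math.sqrt(var_x * var_y)  # correlation coefficient
```

The correlation coefficient always lies in [−1, 1], and for an independent pair the covariance term vanishes.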
Probability density function of a continuous random variable: Consider a small interval (x, x + dx) of length dx around the point x. Let f(x) be a continuous function of x such that f(x) dx represents the probability that X falls in the infinitesimal interval (x, x + dx). Then f(x) is called the probability density function (PDF) of X. The continuous curve y = f(x) is called the probability density curve.
Symbolically,
P(x ≤ X ≤ x + dx) = f(x) dx.
The probability density function of X, i.e., f(x) or fX(x), has the following properties:
(i) f(x) ≥ 0 for all x
(ii) ∫_{–∞}^{∞} f(x) dx = 1

Distribution function: Let X be a random variable with PDF f(x). Then the function
FX(x) = P(X ≤ x) = ∫_{–∞}^{x} f(t) dt, – ∞ < x < ∞
is called the distribution function (or cumulative distribution function) of the random variable X.
Notes: 1. 0 ≤ F(x) ≤ 1 ; – ∞ < x < ∞.
2. F(x) is a non-decreasing function of x.
3. F(– ∞) = lim_{x → –∞} F(x) = ∫_{–∞}^{–∞} f(x) dx = 0
and F(+ ∞) = lim_{x → +∞} F(x) = ∫_{–∞}^{∞} f(x) dx = 1.
4. F(x) is continuous on the right at every point.
5. P(a ≤ X ≤ b) = ∫_{a}^{b} f(x) dx = ∫_{–∞}^{b} f(x) dx – ∫_{–∞}^{a} f(x) dx
= P(X ≤ b) – P(X ≤ a)
= F(b) – F(a)
6. Since F′(x) = (d/dx) F(x) = f(x), we have
dF(x) = f(x) dx
This is known as the probability differential of X.
6.3.2 Joint and Marginal Probability Density Function
Let X and Y be two random variables with probability density functions fX(x) and fY(y). The probability that a point (x, y) lies in the infinitesimal rectangular region of area dx dy is given by f(x, y) dx dy,
i.e., P(x – dx/2 < X ≤ x + dx/2, y – dy/2 < Y ≤ y + dy/2) = f(x, y) dx dy.
Then f(x, y) is called the joint PDF of X and Y, where
∫_{–∞}^{∞} ∫_{–∞}^{∞} f(x, y) dx dy = 1, – ∞ < x < ∞, – ∞ < y < ∞.
The marginal probability density functions of X and Y are given respectively by
h(y) = fY(y) = ∫_{–∞}^{∞} fXY(x, y) dx (for a continuous random variable)
= Σ_x PXY(x, y) (for a discrete random variable)
and g(x) = fX(x) = ∫_{–∞}^{∞} fXY(x, y) dy (for a continuous random variable)
= Σ_y PXY(x, y) (for a discrete random variable)
If fXY(x, y) = g(x) · h(y), then X and Y are said to be independent random variables.
Expected value of a continuous random variable: Let X be a continuous random variable with PDF f(x) in (– ∞, ∞). Then the expectation of X is defined as
E(X) = ∫_{–∞}^{∞} x f(x) dx.
E(X) is also called the mean of the probability distribution defined by f(x).



Important Results:
(i) E(C) = C
(ii) E(aX) = aE(X)
(iii) E(X + Y) = E(X) + E(Y)
(iv) E(X · Y) = E(X) · E(Y), where X and Y are independent random variables
(v) Var(X ± Y) = Var(X) + Var(Y) ± 2 Cov(X, Y)
Note: 1. If g(x) is a continuous function of x, then
E(g(X)) = ∫_{–∞}^{∞} g(x) f(x) dx
and if g(X) = X^r, then
E(X^r) = ∫_{–∞}^{∞} x^r f(x) dx
which is defined as µ′r, the rth moment (about the origin) of the probability distribution; and if g(X) = [X – E(X)]^r = [X – x̄]^r,
then E(g(X)) = E[X – E(X)]^r = ∫_{–∞}^{∞} (x – x̄)^r f(x) dx
which is denoted by µr, the rth moment about the mean x̄.
Moment generating function: Let X be a random variable. Then the moment generating function M(t) is defined as
M(t) = E(e^{tX})
= E(1 + tX + t^2 X^2/2! + ...)
= 1 + t E(X) + (t^2/2!) E(X^2) + ... + (t^r/r!) E(X^r) + ...
= 1 + t µ′1 + (t^2/2!) µ′2 + ... + (t^r/r!) µ′r + ...
The coefficient of t^r/r! in this expansion is the rth moment of X about the origin (i.e., µ′r).
This shows that moments are generated by the moment generating function.
This shows that moments are generated by moment generating function.
For a discrete random variable X which takes the values x1, x2, ... with probabilities P1, P2, ..., the moment generating function is defined as
M(t) = E(e^{tX}) = Σ_i e^{t xi} Pi
and for a continuous random variable X with probability density function f(x), – ∞ < x < ∞,
M(t) = E(e^{tX}) = ∫_{–∞}^{∞} e^{tx} f(x) dx.

Various points about the Moment Generating Function:
(i) M0(t) = E(e^{tX}) is called the moment generating function about the origin.
(ii) Ma(t) = E(e^{t(X – a)}) is called the moment generating function about any point a.
(iii) Mx̄(t) = E(e^{t(X – x̄)}) is called the moment generating function about the mean.
Mx̄(t) = E(e^{t(X – x̄)}) = E[1 + t(X – x̄) + t^2 (X – x̄)^2/2! + ...]
= 1 + t µ1 + (t^2/2!) µ2 + ... + (t^r/r!) µr + ...
where µr is the rth moment about the mean. It is also called the central moment.
Characteristic function: Let X be a random variable. The function φ(t) defined as
φ(t) = E(e^{itX})
is called the characteristic function of X.
Here, i is the imaginary unit defined by i = √(–1).
For a discrete random variable X with values x1, x2, ... and probabilities P1, P2, ..., the characteristic function is defined as
φ(t) = E(e^{itX}) = P1 e^{it x1} + P2 e^{it x2} + ... = Σ_r Pr e^{it xr}
For a continuous random variable with PDF f(x), the characteristic function φ(t) is defined as
φ(t) = E(e^{itX}) = ∫_{–∞}^{∞} f(x) e^{itx} dx
Notes: 1. φ(t) is a complex-valued function of the real variable t.
2. φ(t) always exists.
3. φ(t) is a continuous function of t.
4. φ(t) is uniquely determined by the distribution.
5. φ(0) = 1 and | φ(t) | ≤ 1.
6. φ(t) completely determines the distribution of X.
7. φ(t) generates moments.
8. For any constant k, φ_{kX}(t) = φ_X(kt).
9. φ_{X1 + X2 + ... + Xn}(t) = φ_{X1}(t) · φ_{X2}(t) ... φ_{Xn}(t), for independent X1, X2, ..., Xn.
10. φ(t) and φ(– t) are conjugate functions.
Proof: φX(t) = E(e^{itX}) = E(cos tX + i sin tX)
so the conjugate φX(t)‾ = E(cos tX – i sin tX)
= E(cos(– t)X + i sin(– t)X)
= E(e^{–itX}) = φX(– t)
Since ā denotes the complex conjugate of a, φX(t)‾ = φX(– t), i.e., φX(– t) is the complex conjugate of φX(t).
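Several of these properties are easy to verify numerically for a discrete distribution using complex exponentials. A sketch with an illustrative three-point distribution (not from the text):

```python
import cmath

values, probs = [1, 2, 3], [0.2, 0.3, 0.5]

def phi(t):
    """Characteristic function phi(t) = E(e^{itX}) for a discrete X."""
    return sum(p * cmath.exp(1j * t * x) for x, p in zip(values, probs))

# Properties: phi(0) = 1, |phi(t)| <= 1, phi(-t) is the conjugate of phi(t).
p0 = phi(0.0)
pt = phi(0.7)
pmt = phi(-0.7)
```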
Necessary conditions for a function φ(t) to be a characteristic function: The necessary conditions for a function φ(t) to be the characteristic function of a random variable X are:
(i) φ(0) = ∫_{–∞}^{∞} f(x) dx = 1.
(ii) | φ(t) | ≤ 1 = φ(0).
(iii) φ(t) is a continuous function.
(iv) φ(t) and φ(– t) are conjugate functions.
If a function φ(t) does not satisfy any one of these properties, it cannot be a characteristic function of a random variable X.
For example, if φ(t) = log(1 + t), then φ(0) = log(1 + 0) = log 1 = 0 ≠ 1, so it cannot be a characteristic function of a random variable X.
The conditions stated above, however, are not sufficient. It has been shown that if φ(t) is, near t = 0, of the form
φ(t) = 1 + O(t^{2+δ}) ; δ > 0
where O(t^k) divided by t^k tends to zero as t → 0, then φ(t) cannot be a characteristic function unless it is identically equal to one.
For example, φ(t) = e^{–t^4} = 1 + O(t^4) is not a characteristic function, though it satisfies all the necessary conditions above.
A set of sufficient but not necessary conditions is given below:
φ(t) is a characteristic function if,
(i) φ(0) = 1.
(ii) φ(t) = φ(– t), i.e., φ is symmetric (even) and real-valued.
(iii) φ(t) is continuous.
(iv) lim_{t → ∞} φ(t) = 0.
(v) φ(t) is convex for t > 0, i.e., for t1, t2 > 0,
2 φ[(t1 + t2)/2] ≤ φ(t1) + φ(t2)
This set of sufficient but not necessary conditions is due to Pólya.
Cramér gave the simplest conditions for a function as follows:
The necessary as well as sufficient conditions for a given bounded and continuous function φ(t) to be a characteristic function are:
(i) φ(0) = 1
(ii) φ(x, k) = ∫_0^k ∫_0^k φ(t – r) e^{i(t – r)x} dt dr
is real and non-negative for all real x and all k > 0.
Example 6.1: What is the expected value of the number of points that will be obtained in a single throw with an ordinary die? Find the variance also.
Solution: The random variable X assumes the values 1, 2, 3, 4, 5 and 6 with probability P(X = xi) = 1/6 in each case.
So, E(X) = Σ xi Pi = 1·(1/6) + 2·(1/6) + 3·(1/6) + 4·(1/6) + 5·(1/6) + 6·(1/6) = 3.5
and Var(X) = E(X^2) – [E(X)]^2 = (1/6)(1^2 + 2^2 + ... + 6^2) – (3.5)^2 = 35/12
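Example 6.1 can be replayed directly in a few lines of Python:

```python
faces = range(1, 7)
mean = sum(x / 6 for x in faces)        # E(X) = 3.5
second = sum(x * x / 6 for x in faces)  # E(X^2) = 91/6
var = second - mean ** 2                # 91/6 - 49/4 = 35/12
```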
Example 6.2: A random variable X has the following probability function:

X    :  0   1    2    3    4    5     6
p(X) :  0   k   2k   2k   3k   k^2   2k^2


Find (i) k, (ii) Evaluate p(X < 6), p(X ≥ 6) and p(0 < X < 4), and
(iii) Determine the distribution function of X.
Solution: (i) Since Σ_{x=0}^{6} p(x) = 1, we have
0 + k + 2k + 2k + 3k + k^2 + 2k^2 = 1
⇒ 8k + 3k^2 = 1
or, 3k^2 + 8k – 1 = 0 ⇒ k = (–8 ± √(8^2 + 4 × 3))/6 = (–4 ± √19)/3
Since k must be non-negative, we take the positive root, k = (–4 + √19)/3.
(ii) P(X < 6) = P(X = 0) + P(X = 1) + ... + P(X = 5)
= 0 + k + 2k + 2k + 3k + k^2 = 8k + k^2
= 8k + 3k^2 – 2k^2 = 1 – 2k^2.
P(X ≥ 6) = 1 – P(X < 6) = 1 – (1 – 2k^2) = 2k^2
P(0 < X < 4) = P(X = 1) + ... + P(X = 3) = k + 2k + 2k = 5k
(iii) Distribution function of X:

X    FX(x) = P(X ≤ x)
0    0
1    k
2    k + 2k = 3k
3    k + 2k + 2k = 5k
4    k + 2k + 2k + 3k = 8k
5    k + 2k + 2k + 3k + k^2 = 8k + k^2
6    k + 2k + 2k + 3k + k^2 + 2k^2 = 8k + 3k^2 = 1
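The value of k found in Example 6.2 and the total probability can be checked numerically:

```python
import math

# Positive root of 3k^2 + 8k - 1 = 0, as derived above.
k = (-4 + math.sqrt(19)) / 3
probs = [0, k, 2 * k, 2 * k, 3 * k, k * k, 2 * k * k]
total = sum(probs)  # should equal 1
```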

Example 6.3: Show that E(aX + bY) = aE(X) + bE(Y)


Solution: E(aX + bY) = E(aX) + E(bY) (as E(A + B) = E(A) + E(B))
= aE(X) + bE(Y) (as E(CX) = C.E(X))
Example 6.4: In three tosses of a coin, let X be the number of heads. Calculate the expected value of X. Also determine the expectation of the number of tails.
Solution: Let P(head) = P(tail) = 1/2
P(X = 3) = 3C3/8 = 1/8
P(X = 2) = 3C2/8 = 3/8
P(X = 1) = 3C1/8 = 3/8
P(X = 0) = 3C0/8 = 1/8
Hence, the probability distribution of X is

x = k            :   0     1     2     3
p(x = k) = p(k)  :  1/8   3/8   3/8   1/8
So, E(X) = Σ_{k=0}^{3} k Pk = 0·(1/8) + 1·(3/8) + 2·(3/8) + 3·(1/8) = 12/8 = 3/2
Let X′ be the number of tails. Then
P(X′ = x′) = 3Cx′/8, x′ = 0, 1, 2, 3,
and thus, E(X′) = E(X) = 3/2.
Example 6.5: A continuous random variable X has a PDF
f(x) = 3x^2, 0 ≤ x ≤ 1. Find a and b such that
(i) P{X ≤ a} = P{X > a}, and
(ii) P{X > b} = 0.05
Solution: (i) Since P(X ≤ a) = P(X > a), each must be equal to 1/2, because the total probability is always 1.
∴ P(X ≤ a) = 1/2 and P(X > a) = 1/2
⇒ P(X ≤ a) = ∫_0^a f(x) dx = ∫_0^a 3x^2 dx = 1/2
⇒ [x^3]_0^a = 1/2 ⇒ a^3 – 0 = 1/2 ⇒ a = (1/2)^{1/3}
(ii) P(X > b) = 0.05 ⇒ ∫_b^1 f(x) dx = 0.05
⇒ [x^3]_b^1 = 0.05 ⇒ 1 – b^3 = 0.05
⇒ b^3 = 19/20 ⇒ b = (19/20)^{1/3}.
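Since the distribution function here is F(x) = x^3 on [0, 1], both answers can be verified directly:

```python
a = 0.5 ** (1 / 3)        # cube root of 1/2
b = (19 / 20) ** (1 / 3)  # cube root of 19/20

# F(x) = x^3 on [0, 1] for f(x) = 3x^2
P_le_a = a ** 3           # should be 0.5
P_gt_b = 1 - b ** 3       # should be 0.05
```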




CHECK YOUR PROGRESS


1. What is sample space?
2. What does probability distribution refer to?
3. Name the types of random variable.
4. What are the necessary and sufficient conditions for a function φ(t) to be
characteristic function?

6.4 DISCRETE THEORETICAL DISTRIBUTIONS

Theoretical Distributions: If a certain hypothesis is assumed, it is sometimes possible to derive mathematically what the frequency distributions of certain universes should be. Such distributions are called theoretical distributions.
6.4.1 Binomial Distribution
Binomial distribution was discovered by James Bernoulli in the year 1700.
Let there be an event whose probability of success* is P and whose probability of failure is Q in one trial, so that P + Q = 1.
Consider a set of n independent trials in which the probability P of success is the same in every trial; then Q = 1 – P is the probability of failure in any trial.
Let the set of n trials be repeated N times, where N is a very large number. Out of these N sets, there will be sets with no successes, sets with one success, sets with two successes, and so on.
Now the probability that the first k trials are successes and the remaining (n – k) trials are failures is P^k Q^{n–k}.
Since the k successes can be chosen out of the n trials in nCk ways, the probability of k successes, P(k), in a series of n independent trials is given by
P(k) = nCk · P^k · Q^{n–k}
The probability distribution of the number of successes so obtained is called the 'Binomial probability distribution'.
The probabilities of 0, 1, 2, ..., n successes, namely nC0 P^0 Q^n, nC1 P^1 Q^{n–1}, ..., nCn P^n Q^0, are the successive terms of the binomial expansion of (Q + P)^n.
Definition: A random variable X is said to follow binomial distribution if it assumes
only non-negative values and its probability mass function is given by,

* Let a random experiment be performed repeatedly; the occurrence of an event in a trial is called a success and its non-occurrence a failure.
P(X = k) = P(k) = nCk · P^k · Q^{n–k} ; k = 0, 1, 2, ..., n; Q = 1 – P
= 0 ; otherwise
Usually we use the notation X ~ B(n, P) to denote that the random variable X follows the binomial distribution with parameters n and P.
Notes: 1. Since n and P are independent constants in the binomial distribution, they are called the parameters of the distribution.
2. Σ_{k=0}^{n} P(k) = Σ_{k=0}^{n} nCk P^k Q^{n–k} = (Q + P)^n = 1
3. If the experiment consisting of n trials is repeated N times, then the frequency function of the binomial distribution is given by
f(k) = N · P(k) = N · [nCk · P^k · Q^{n–k}] ; k = 0, 1, 2, ..., n
and the expected frequencies of 0, 1, 2, ..., n successes are the successive terms of the binomial expansion N(Q + P)^n ; (Q + P = 1)
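The probability mass function and the identity Σ P(k) = (Q + P)^n = 1 can be sketched with Python's `math.comb`; the parameters n and P below are illustrative:

```python
from math import comb

def binom_pmf(k, n, P):
    """P(k) = nCk * P^k * Q^(n-k), with Q = 1 - P."""
    return comb(n, k) * P**k * (1 - P)**(n - k)

n, P = 10, 0.3
total = sum(binom_pmf(k, n, P) for k in range(n + 1))     # should be 1
mean = sum(k * binom_pmf(k, n, P) for k in range(n + 1))  # should be nP = 3
```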
6.4.2 Moments
The first four moments about the origin of the binomial distribution are obtained as follows:
(i) Mean or first moment about the origin
µ′1 = E(X) = Σ_{x=0}^{n} x · nCx · P^x · Q^{n–x}
= nP Σ_{x=1}^{n} (n–1)C(x–1) · P^{x–1} · Q^{n–x}
= nP[Q^{n–1} + (n–1)C1 P Q^{n–2} + ... + P^{n–1}]
= nP(Q + P)^{n–1} = nP, as P + Q = 1
So, mean = µ′1 = nP.

(ii) Second moment about the origin
µ′2 = E(X^2) = Σ_{x=0}^{n} x^2 · nCx · P^x · Q^{n–x}
= Σ_{x=0}^{n} [x + x(x – 1)] · nCx · P^x · Q^{n–x}
= Σ_{x=0}^{n} x · nCx · P^x · Q^{n–x} + Σ_{x=0}^{n} x(x – 1) · nCx · P^x · Q^{n–x}
= nP + n(n – 1)P^2 · Σ_{x=2}^{n} (n–2)C(x–2) · P^{x–2} · Q^{n–x}
= nP + n(n – 1)P^2 · (Q + P)^{n–2}
µ′2 = nPQ + n^2 P^2.


(iii) Third moment about the origin
µ′3 = E(X^3) = Σ_{x=0}^{n} x^3 · nCx · P^x · Q^{n–x}
On simplifying, we get
µ′3 = n(n – 1)(n – 2)P^3 + 3n(n – 1)P^2 + nP.
(iv) Fourth moment about the origin
µ′4 = E(X^4) = Σ_{x=0}^{n} x^4 · nCx · P^x · Q^{n–x}
and on simplifying, we get
µ′4 = n(n – 1)(n – 2)(n – 3)P^4 + 6n(n – 1)(n – 2)P^3 + 7n(n – 1)P^2 + nP.
Now, the moments about the mean of the binomial distribution will be discussed.
(i) First moment about the mean
µ1 = 0 (always)
(ii) Second moment about the mean
µ2 = µ′2 – (µ′1)^2 = E(X^2) – [E(X)]^2
= nPQ + n^2 P^2 – n^2 P^2 = nPQ
Thus, Var(X) = µ2 = nPQ and the standard deviation σ(X) = √(nPQ)
(iii) Third and fourth moments about the mean
(a) µ3 = µ′3 – 3µ′2 µ′1 + 2(µ′1)^3
= nPQ(Q – P) (on simplification)
(b) µ4 = µ′4 – 4µ′3 µ′1 + 6µ′2 (µ′1)^2 – 3(µ′1)^4
= 3n^2 P^2 Q^2 – 6nP^2 Q^2 + nPQ
= nPQ[1 + 3(n – 2)PQ]
Hence, β1 = µ3^2/µ2^3 = n^2 P^2 Q^2 (Q – P)^2/(n^3 P^3 Q^3) = (Q – P)^2/(nPQ) = (1 – 2P)^2/(nPQ)
β2 = µ4/µ2^2 = nPQ[1 + 3(n – 2)PQ]/(n^2 P^2 Q^2) = 3 + (1 – 6PQ)/(nPQ)
γ1 = √β1 = (Q – P)/√(nPQ) and γ2 = β2 – 3 = (1 – 6PQ)/(nPQ)
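The central-moment formulas µ2 = nPQ and µ3 = nPQ(Q − P) can be verified numerically for any choice of parameters; n and P below are illustrative:

```python
from math import comb

n, P = 12, 0.4
Q = 1 - P
pmf = [comb(n, k) * P**k * Q**(n - k) for k in range(n + 1)]

m = sum(k * p for k, p in enumerate(pmf))               # mean, should be nP
mu2 = sum((k - m) ** 2 * p for k, p in enumerate(pmf))  # should be nPQ
mu3 = sum((k - m) ** 3 * p for k, p in enumerate(pmf))  # should be nPQ(Q - P)
```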



Moment generating function of the binomial distribution: Let X have a binomial distribution with probability function
P(x) = nCx P^x Q^{n–x} ; x = 0, 1, 2, ..., n
The moment generating function about the origin is given by
M0(t) = E(e^{tX}) = Σ_{x=0}^{n} e^{tx} · nCx P^x · Q^{n–x}
= Σ_{x=0}^{n} nCx (P e^t)^x · Q^{n–x}
M0(t) = (Q + P e^t)^n
The moment generating function about the mean nP is given by
MnP(t) = E[e^{t(X – nP)}] = E(e^{tX} · e^{–nPt})
= e^{–nPt} · E(e^{tX}) = e^{–nPt} · M0(t)
= e^{–nPt} (Q + P e^t)^n
= (Q e^{–Pt} + P e^{(1 – P)t})^n
MnP(t) = (Q e^{–Pt} + P e^{Qt})^n
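That M0(t) = (Q + Pe^t)^n generates the moments can be checked by differentiating numerically at t = 0; the first derivative M0′(0) should equal the mean nP (parameters below are illustrative):

```python
from math import exp

n, P = 8, 0.25
Q = 1 - P
M = lambda t: (Q + P * exp(t)) ** n  # MGF about the origin

h = 1e-6
mean = (M(h) - M(-h)) / (2 * h)      # central difference approximates M'(0) = nP
```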

6.4.3 Poisson Distribution with Mean and Variance
The Poisson distribution is a limiting case of the binomial distribution when the probability of success or failure (i.e., P or Q) is very small and the number of trials n is very large (i.e., n → ∞), such that nP is a finite constant, say λ, i.e., nP = λ. Under these conditions P(x), the probability of x successes in the binomial distribution,
P(x) = P(X = x) = nCx P^x Q^{n–x}, can be written as
P(x) = [n!/(x!(n – x)!)] · (λ/n)^x · (1 – λ/n)^{n–x}
= (λ^x/x!) (1 – λ/n)^n · n!/[n^x (n – x)! (1 – λ/n)^x]
Taking the limit as n → ∞, we have
P(x) = (λ^x/x!) · Lim_{n→∞} (1 – λ/n)^n · Lim_{n→∞} n!/[(n – x)! (1 – λ/n)^x n^x]   ...(6.1)
We know that Lim_{n→∞} (1 – λ/n)^n = e^{–λ} and Lim_{n→∞} (1 – λ/n)^x = 1



and using Stirling’s formula for n!, we have, Probability Distribution

n! = 2π. n n+1/ 2 e− n

n! 2π.n n+1/ 2 .e− n NOTES


=
(n − x )! 2π.(n − x) n− x +1/ 2 .e− n + x

n! n n+1/ 2 .e− n
so, =
(n − x )! (n − x) n− x +1/ 2 .e− n+ x .n x
Thus, Equation (6.1) becomes,

λ x −λ n n+1/ 2 .e − n
P(x) = .e .L im
x! n→∞ ( n − x ) n − x +1/ 2 .e − n + x .n x

λ x .e −λ n n+1/ 2 .e− n
= .L im
x !.e x n →∞ (n − x) n− x+1/ 2 .e − n .n x

λ x .e−λ n n − x+1/ 2
= .L im 1
x !.e x n→∞ n− x+
n − x +1/ 2  x 2
n 1 − 
 n

λ x .e−λ  1 
= x
.L im  1 
x !.e n→∞
 x 2 
n − x+
x 
 1 −  . 1 −  
 n  n 

λ x .e−λ 1 λ x .e −λ
= . =
x !.e x e− x .1 x!
Thus,
λ x .e−λ
P(X = x) = P(x) = ; x = 0, 1, 2, ...
x!
When n → ∞, nP = λ and P → 0
Here, λ is known as the parameter of Poisson distribution.
Definition: A random variable X is said to follow a Poisson distribution if it assumes only non-negative values and its probability mass function is given by
P(x, λ) = P(X = x) = λ^x e^{–λ}/x! ; x = 0, 1, 2, ...; λ > 0.
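The limiting argument above can be observed numerically: for large n and p = λ/n, the binomial probability approaches the Poisson probability. The values of λ, x and n below are illustrative:

```python
from math import comb, exp, factorial

lam, x = 2.0, 3
poisson = exp(-lam) * lam**x / factorial(x)

n = 10**6
p = lam / n
binom = comb(n, x) * p**x * (1 - p)**(n - x)  # should be close to `poisson`
```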



Constants of the Poisson distribution:
(i) Mean, µ′1 = E(X) = λ.
(ii) µ′2 = E(X^2) = λ + λ^2.
(iii) µ′3 = E(X^3) = λ^3 + 3λ^2 + λ.
(iv) µ′4 = E(X^4) = λ^4 + 6λ^3 + 7λ^2 + λ.
(v) First moment about the mean, µ1 = 0.
(vi) V(X) = µ2 = E(X^2) – [E(X)]^2 = µ′2 – µ′1^2 = λ + λ^2 – λ^2 = λ
Note that Mean = Variance = λ.
(vii) Standard deviation σ(X) = √λ.
(viii) µ3, i.e., the third moment about the mean:
µ3 = µ′3 – 3µ′2 µ′1 + 2µ′1^3 = (λ^3 + 3λ^2 + λ) – 3λ(λ^2 + λ) + 2λ^3 = λ.
(ix) µ4 = µ′4 – 4µ′3 µ′1 + 6µ′2 µ′1^2 – 3µ′1^4 = 3λ^2 + λ.
Thus, the coefficients of skewness and kurtosis are given by
β1 = µ3^2/µ2^3 = λ^2/λ^3 = 1/λ and γ1 = √β1 = 1/√λ
Also, β2 = µ4/µ2^2 = 3 + 1/λ and γ2 = β2 – 3 = 1/λ
Hence, the Poisson distribution is always a skewed distribution; only in the limit λ → ∞ do we get β1 = 0, β2 = 3.
Negative binomial distribution: Suppose there are n trials of an event. We assume that
(i) The trials are independent.
(ii) P is the probability of success, which remains constant from trial to trial.
Let f(x; r, P) denote the probability that there are x failures preceding the rth success in (x + r) trials.
Now, in (x + r) trials the last trial must be a success, with probability P. In the remaining (x + r – 1) trials we must have (r – 1) successes, whose probability is given by
(x+r–1)C(r–1) · P^{r–1} · Q^x.
Therefore, by the compound probability theorem, f(x; r, P) is given by the product of these two probabilities:
f(x; r, P) = (x+r–1)C(r–1) · P^{r–1} · Q^x · P
= (x+r–1)C(r–1) · P^r · Q^x.


Moment generating function of the Poisson distribution
Let P(X = x) = e^{–λ} λ^x/x! ; x = 0, 1, 2, ..., ∞; λ > 0 be a Poisson distribution. Then the moment generating function is given by
M(t) = E(e^{tX}) = Σ_{x=0}^{∞} e^{tx} · e^{–λ} λ^x/x!
= e^{–λ} Σ_{x=0}^{∞} (λ e^t)^x/x!
= e^{–λ} · e^{λ e^t} = e^{λ(e^t – 1)}
and the moment generating function about the mean is
Mλ(t) = E(e^{t(X – λ)}) = e^{–λt} · E(e^{tX})
= e^{–λt} · e^{λ(e^t – 1)}
so, Mλ(t) = e^{λ e^t – λt – λ}

Definition: A random variable X is said to follow a negative binomial distribution if its probability mass function is given by
P(X = x) = P(x) = (x+r–1)C(r–1) · P^r · Q^x ; x = 0, 1, 2, ...
= 0 ; otherwise.
Also, we know that nCr = nC(n–r),
so (x+r–1)C(r–1) = (x+r–1)Cx
= (x + r – 1)(x + r – 2) ... (r + 1) r / x!
= (–1)^x (–r)(–r – 1) ... (–r – x + 2)(–r – x + 1) / x! = (–1)^x · (–r)Cx
so P(x) = (–r)Cx · P^r · (–Q)^x ; x = 0, 1, 2, ...
= 0 ; otherwise
which is the (x + 1)th term in the expansion of P^r (1 – Q)^{–r}, a binomial expansion with a negative index. Hence, the distribution is called a negative binomial distribution. Also,
Σ_{x=0}^{∞} P(x) = P^r Σ_{x=0}^{∞} (–r)Cx (–Q)^x = P^r × (1 – Q)^{–r} = 1
Therefore, P(x) represents a probability function, and the discrete variable which follows this probability function is called a negative binomial variable.
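That Σ P(x) = 1 can be checked by truncating the infinite sum at a point where Q^x is negligible; the parameters r and P below are illustrative:

```python
from math import comb

r, P = 3, 0.4
Q = 1 - P

def negbin_pmf(x):
    """P(x) = (x+r-1)C(r-1) * P^r * Q^x."""
    return comb(x + r - 1, r - 1) * P**r * Q**x

# Truncate the infinite sum; the tail beyond x = 500 is negligible for Q = 0.6.
total = sum(negbin_pmf(x) for x in range(500))
```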
Example 6.6: A continuous random variable X has a probability distribution function
f(x) = 3x^2, 0 ≤ x ≤ 1.
Find a and b such that
(i) p{X ≤ a} = p{X > a}, and
(ii) p{X > b} = 0.05
Solution: (i) Since P{X ≤ a} = P{X > a},
each must be equal to 1/2, because the total probability is always 1.
∴ P{X ≤ a} = 1/2 ⇒ ∫_0^a f(x) dx = 1/2
or, 3 ∫_0^a x^2 dx = 1/2 ⇒ 3[x^3/3]_0^a = 1/2
or, a^3 = 1/2 ⇒ a = (1/2)^{1/3}
(ii) p{X > b} = 0.05
⇒ ∫_b^1 f(x) dx = 0.05 ⇒ 3 ∫_b^1 x^2 dx = 0.05
⇒ 3[x^3/3]_b^1 = 0.05 ⇒ 1 – b^3 = 1/20
⇒ b = (19/20)^{1/3}

Example 6.7: A probability curve y = f(x) has a range from 0 to ∞. If f(x) = e^{–x}, find the mean and variance and the third moment about the mean.
Solution: We know that the rth moment about the origin is
µ′r = ∫_0^∞ x^r f(x) dx = ∫_0^∞ x^r e^{–x} dx = Γ(r + 1) = r! (using the Gamma integral)
Substituting r = 1, 2 and 3, we have
Mean, µ′1 = 1! = 1
µ′2 = 2! = 2 and µ′3 = 3! = 6
Thus, variance = µ2 = µ′2 – (µ′1)^2 = 2 – 1 = 1
and µ3 = µ′3 – 3µ′2 µ′1 + 2(µ′1)^3 = 6 – 3 × 2 + 2 = 2 is the required third moment about the mean.
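The moments µ′r = r! can be confirmed by numerical integration, here by a midpoint rule over a truncated range (the upper limit and step count are arbitrary choices):

```python
import math

def moment(r, upper=60.0, n=200_000):
    """Approximate the integral of x^r * e^(-x) over [0, upper] by the midpoint rule."""
    h = upper / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        total += x**r * math.exp(-x)
    return total * h

m1, m2, m3 = moment(1), moment(2), moment(3)
variance = m2 - m1**2              # should be ~1
mu3 = m3 - 3 * m2 * m1 + 2 * m1**3  # should be ~2
```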
6.4.4 Uniform Distribution
A random variable X is said to have a continuous uniform distribution over an
interval (a, b) if its probability density function is constant say k, over the entire
range of X.
That is, f(x) = k ; a < X < b
= 0 ; otherwise
Since the total probability is always unity, we have
∫_a^b f(x) dx = 1 ⇒ k ∫_a^b dx = 1 ⇒ k = 1/(b – a)
Thus, f(x) = 1/(b – a) ; a < X < b
= 0 ; otherwise
This is also known as the rectangular distribution, as the curve y = f(x) describes a rectangle over the x-axis between the ordinates x = a and x = b.
The distribution function F(x) is given by
F(x) = 0 ; if – ∞ < x < a
= (x – a)/(b – a) ; a ≤ x ≤ b
= 1 ; b < x < ∞
F(x) is continuous everywhere but is not differentiable at x = a and x = b. Thus, (d/dx) F(x) = f(x) = 1/(b – a) ≠ 0 exists everywhere except at the points x = a and x = b, and consequently the probability density function f(x) is given by
f(x) = 1/(b – a) ; a < x < b
= 0 ; otherwise

Figure 6.1 Uniform Distribution

Moments of the uniform distribution:
µ′r = ∫_a^b x^r f(x) dx = [1/(b – a)] · (b^{r+1} – a^{r+1})/(r + 1).
In particular,
mean, µ′1 = [1/(b – a)] · (b^2 – a^2)/2 = (b + a)/2
and µ′2 = [1/(b – a)] · (b^3 – a^3)/3 = (b^2 + ab + a^2)/3
∴ Variance = µ2 = µ′2 – (µ′1)^2
= (b^2 + ab + a^2)/3 – [(b + a)/2]^2
= (b – a)^2/12
The moment generating function is given by
MX(t) = ∫_a^b e^{tx} f(x) dx = (e^{bt} – e^{at})/(t(b – a))
and the characteristic function is given by
φX(t) = ∫_a^b e^{itx} f(x) dx = (e^{ibt} – e^{iat})/(it(b – a))
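The mean (a + b)/2 and variance (b − a)²/12 can be checked by numerical integration of the constant density 1/(b − a); the endpoints a and b below are illustrative:

```python
a, b = 2.0, 5.0
n = 100_000
h = (b - a) / n
xs = [a + (i + 0.5) * h for i in range(n)]  # midpoint grid over (a, b)

density = 1.0 / (b - a)
m1 = sum(xs) * h * density                  # E(X)
m2 = sum(x * x for x in xs) * h * density   # E(X^2)
var = m2 - m1 ** 2
```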

CHECK YOUR PROGRESS


5. Who discovered binomial distribution?
6. When is a random variable X said to follow binomial distribution?
7. What is Poisson distribution?

6.5 SUMMARY

In this unit, you have learned that:


• Probability distribution refers to mathematical models of relative frequencies of a finite number of observations of a variable. It is a systematic arrangement of probabilities associated with mutually exclusive and collectively exhaustive elementary events in an experiment.
• There are two categories of probability distribution based on the nature of
random variable. The distribution which is based on discrete or finite random
variables is known as discrete or theoretical probability distribution, whereas
the distributions in which random variable is continuous is known as
continuous probability distribution.
• Discrete random variable and continuous random variable are the two types
of random variables. A random variable is said to be discrete random variable
if it assumes only a finite or countable infinite number of values of X.
• If a random variable assumes any value in some interval or intervals, it is called a continuous random variable.
• The probability distribution of a discrete random variable is the set of ordered pairs (xi, p(xi)).
• If X and Y are two random variables with probability density functions fX(x) and fY(y), the probability that a point (x, y) lies in the infinitesimal rectangular region of area dx dy is given by f(x, y) dx dy.
• The Poisson distribution is a limiting case of the binomial distribution when the probability of success or failure (i.e., P or Q) is very small and the number of trials n is very large (i.e., n → ∞), such that nP is a finite constant, say λ, i.e., nP = λ.
• A random variable X is said to have a continuous uniform distribution over
an interval (a, b) if its probability density function is constant say k, over the
entire range of X.

6.6 KEY TERMS

• Random variable: If E is a random experiment and s is the outcome of a


sample space S associated with it, then a function X(s) is called a random
variable.
• Discrete random variable: It is a type of random variable which assumes
only a finite or countable infinite number of values of X.
• Poisson distribution: It is a limiting case of the binomial distribution when the probability of success or failure is very small and the number of trials n is very large.

6.7 ANSWERS TO ‘CHECK YOUR PROGRESS’

1. A sample space is the collection of all possible events or outcomes of an


experiment
2. Probability distribution refers to mathematical models of relative frequencies
of a finite number of observations of a variable. This is done by listing all the
possible outcomes of an experiment.
3. There are two types of random variable namely, discrete random variable
and continuous random variable.
4. The necessary conditions for a function φ(t) to be the characteristic function of a random variable X are:
(i) φ(0) = ∫_{–∞}^{∞} f(x) dx = 1.
(ii) | φ(t) | ≤ 1 = φ(0).
(iii) φ(t) is a continuous function.
(iv) φx(t) = φx(– t) are conjugate functions.
5. Binomial distribution was discovered by James Bernoulli.
6. A random variable X is said to follow binomial distribution if it assumes only
non-negative values and its probability mass function is given by,
P(X = k) = P(k) = nCk · P^k · Q^{n–k} ; k = 0, 1, 2, ..., n; Q = 1 – P
= 0 ; otherwise
7. Poisson distribution is a limiting case of the binomial distribution when the probability of success or failure is very small and the number of trials n is very large.

6.8 QUESTIONS AND EXERCISES

Short-Answer Questions
1. What is classical definition of probability?
2. What is discrete random variable?
3. Define the binomial distribution.
4. Write the types of random variables.

Long-Answer Questions
1. Prove that the expectation of the sum of the two random variables, is equal
to the sum of their expectation, i.e.
E(X + Y) = E(X) + E(Y)
where X and Y are two random variables.
2. How would you define the correlation coefficient ρXY or r(X,Y) for two
random variables X and Y?
3. Write a short note on probability density function of continuous random
variable.
4. List the various points about Moment Generating Function.
5. What is the expected value of the number of points that will be obtained in a single throw with an ordinary die? Find the variance also.
6. Describe the types of random variables along with their probabilities and
mathematical expectation.
7. Discuss joint and marginal probability density function.
8. Obtain the first four moments about origin of binomial distribution.
9. Describe uniform distribution and moments of uniform distribution.



6.9 FURTHER READING

Sancheti, D.C. and V.K. Kapoor. Business Mathematics. New Delhi: Sultan Chand & Sons.
Levin, Richard I. 1991. Statistics for Management. New Jersey: Prentice-Hall.
Chance, William A. 1969. Statistical Methods for Decision Making. Illinois:
Richard D Irwin.
Meyer, Paul L. 1970. Introductory Probability and Statistical Applications.
Massachusetts: Addison-Wesley Publishing Co.
Johnson, R.A. Probability and Statistics for Engineers. New Delhi: PHI.
Trivedi, K.S. 1994. Probability and Statistics with Reliability. New Delhi: PHI.
Levin, Richard. I., and David. S. Rubin. 1997. Statistics for Management, 7th
edition, New Jersey: Prentice Hall International.
Gupta, S.C., and V.K. Kapoor. 1997. Fundamentals of Mathematical Statistics.
9th Revised Edition, New Delhi: Sultan Chand & Sons.
Freud, J.E., and F.J. William. 1997. Elementary Business Statistics – The
Modern Approach. 3rd edition, New Jersey: Prentice Hall International.
Goon, A.M., M.K. Gupta, and B. Das Gupta. 1983. Fundamentals of Statistics.
Vols. I & II, Calcutta: The World Press Pvt. Ltd.
Hogg, Robert. V., Allen T. Craig and Joseph W. McKean. 2005. Introduction to
Mathematical Statistics. New Delhi: Pearson Education.



Approximation Theory

UNIT 7 APPROXIMATION THEORY


Structure
7.0 Introduction
7.1 Unit Objectives
7.2 Taylor’s Series Representation
7.3 Chebyshev Polynomials and Inequality
7.3.1 Chebychev Inequality
7.4 Distribution
7.4.1 Central Limit Theorem
7.5 Laws of Large Numbers
7.5.1 Weak Law of Large Numbers
7.5.2 Strong Law of Large Numbers
7.6 Normal Approximation
7.7 Summary
7.8 Key Terms
7.9 Answers to ‘Check Your Progress’
7.10 Questions and Exercises
7.11 Further Reading

7.0 INTRODUCTION

Approximation theory deals with two types of problems: (i) when a function is
given explicitly and you want to find a simpler type, such as a polynomial, to
represent it, and (ii) when the problem concerns fitting a function to the given data
and finding the best function in a certain class to represent the data.
The major aim of approximation of functions is to represent a function with
minimum error, as this is a central problem in software development. As you know,
computers are essentially arithmetic devices; the most elaborate function they can
compute directly is a rational function, a ratio of polynomials. There are two ways to
approximate a function using a polynomial:
(1) Using truncated Taylor’s series
(2) Using Chebyshev polynomials
In this unit, you will learn to approximate a function by using these two
techniques.
In addition, you will learn about the normal distribution, the laws of large
numbers and the Central Limit Theorem.



7.1 UNIT OBJECTIVES

After going through this unit, you will be able to:


• Compute a function using truncated Taylor series
• Approximate a function using Chebyshev’s polynomials
• Understand Chebychev’s inequality
• Explain the concept of normal distribution and Central Limit Theorem
• Understand the weak and strong laws of large numbers
• Use normal approximations

7.2 TAYLOR’S SERIES REPRESENTATION

Let f(x) be a function which has derivatives up to the (n + 1)th order in an interval [a, b].
This function may be expressed near x = x0 in [a, b] as follows:

f(x) = f(x0) + (x − x0) f′(x0) + ((x − x0)²/2!) f″(x0) + ...
       + ((x − x0)^n/n!) f^(n)(x0) + ((x − x0)^(n+1)/(n + 1)!) f^(n+1)(k)

where f′(x0), f″(x0), ..., f^(n)(x0) are the derivatives of f(x) evaluated at x0.

The term ((x − x0)^(n+1)/(n + 1)!) f^(n+1)(k) is called the remainder term. Here, k is a
function of x and lies between x0 and x.
This remainder term gives the truncation error when only the first n terms of the
Taylor series are used to represent the function.

Hence, the truncation error (T.E.) = ((x − x0)^(n+1)/(n + 1)!) f^(n+1)(k), so that

|T.E.| ≤ (|x − x0|^(n+1)/(n + 1)!) · M

where M = max |f^(n+1)(k)| for k in [a, b].



Example 7.1: Give a Taylor series representation of f(x) = sin x and compute
sin x correctly upto three significant digits.


Solution: The Taylor series representation of sin x is (when x0 = 0)

f(x) = f(x0) + f′(x0)(x − x0) + f″(x0)(x − x0)²/2! + ...
= sin(0) + cos(0) · x − sin(0) · x²/2! − cos(0) · x³/3! + ...
Thus,
sin x = x − x³/3! + x⁵/5! − x⁷/7! + ...
or, sin x = x − x³/6 + x⁵/120 − x⁷/5040 + ...        ...(7.1)
Since sin x is required correctly upto three significant digits, and for |x| ≤ 1 the
truncation error after three terms of Equation (7.1) is
T.E. ≤ 1/5040 = 0.000198,
the series is truncated after three terms.

So, sin x = x − x³/3! + x⁵/5!
This is the required representation to obtain the value of sin x correctly upto
three significant digits.
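The truncated series can be checked numerically. The sketch below (illustrative Python, not part of the original text; the helper name sin_taylor is ours) evaluates the three-term polynomial on a grid in [−1, 1] and confirms that the worst error stays below the bound 1/5040:

```python
import math

def sin_taylor(x, terms=3):
    # truncated Taylor series of sin x about x0 = 0, keeping `terms` odd-power terms
    return sum((-1) ** k * x ** (2 * k + 1) / math.factorial(2 * k + 1)
               for k in range(terms))

# The first omitted term is x^7/7!, so for |x| <= 1 the truncation error
# should stay below 1/5040, i.e. about 0.000198.
grid = [k / 100 for k in range(-100, 101)]
worst = max(abs(sin_taylor(x) - math.sin(x)) for x in grid)
```

Because the series is alternating, the error is in fact bounded by the first omitted term, which is what the check above illustrates.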

7.3 CHEBYSHEV POLYNOMIALS AND INEQUALITY

The Chebyshev polynomials {Tn(x)} are orthogonal on (−1, 1) with respect to the
weight function w(x) = (1 − x²)^(−1/2). The Chebyshev polynomial is defined by the
following relation:
For x ∈ [−1, 1], define
Tn(x) = cos(n cos^(−1)(x)) for each n ≥ 0.
It is not obvious from this definition that Tn(x) is an nth-degree polynomial in
x, but now it will be proved that it is. First note that
T0(x) = cos 0 = 1 and T1(x) = cos(cos^(−1)(x)) = x.
For n ≥ 1, introduce the substitution θ = cos–1 x to change this
equation to



Tn(x) = Tn(θ(x)) ≡ Tn(θ) = cos(nθ), where θ ∈ [0, π].
A recurrence relation is derived by noting that:
Tn + 1 (θ) = cos( nθ + θ) = cos( nθ) cos θ − sin( nθ) sin θ
and Tn − 1 (θ) = cos( nθ − θ) = cos( nθ) cos θ + sin( nθ) sin θ.
Adding these equations gives the following:
Tn + 1 (θ) + Tn − 1 (θ) = 2 cos( nθ) cos θ.

Returning to the variable x and solving for Tn+1(x), you have, for each n ≥ 1,
Tn+1(x) = 2 cos(n arccos x) · x − Tn−1(x) = 2Tn(x) · x − Tn−1(x).

Since T0 ( x) and T1 ( x) are both polynomials in x, Tn + 1 ( x) will be a


polynomial in x for each n.
[Chebyshev Polynomials] T0 ( x) = 1, T1 ( x) = x,
and, for n ≥ 1, Tn + 1 ( x) is the polynomial of degree n + 1 given by
Tn + 1 ( x) = 2 xTn ( x) − Tn − 1 ( x).

The recurrence relation implies that Tn(x) is a polynomial of degree n, and
it has leading coefficient 2^(n−1) when n ≥ 1. The next three Chebyshev polynomials
therefore are as follows:
T2 ( x) = 2 xT1 ( x) − T0 ( x) = 2 x 2 − 1.
T3 ( x ) = 2 xT2 ( x) − T1 ( x) = 4 x3 − 3 x.
and T4 ( x) = 2 xT3 ( x) − T2 ( x) = 8 x 4 − 8 x 2 + 1.
The graphs of T1, T2 , T3 , and T4 are shown in Figure 7.1. Notice that each
of the graphs is symmetric to either the origin or the y-axis, and that each assumes
a maximum value of 1 and a minimum value of –1 on the interval [–1, 1].
Figure 7.1 Graph of Chebyshev Polynomials


The first few Chebyshev polynomials are summarized as follows:
T0(x) = 1
T1(x) = x
T2(x) = 2x² − 1
T3(x) = 4x³ − 3x
T4(x) = 8x⁴ − 8x² + 1
T5(x) = 16x⁵ − 20x³ + 5x
T6(x) = 32x⁶ − 48x⁴ + 18x² − 1
Note: The coefficient of xⁿ in Tn(x) is always 2^(n−1).
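The recurrence above gives a convenient way to evaluate Tn(x) numerically. A minimal sketch (Python; the function name T is our own) evaluates the recurrence and checks it against the defining relation Tn(x) = cos(n cos^(−1) x):

```python
import math

def T(n, x):
    # evaluate T_n(x) using the recurrence T_{n+1}(x) = 2x T_n(x) - T_{n-1}(x)
    if n == 0:
        return 1.0
    prev, curr = 1.0, x          # T0 and T1
    for _ in range(n - 1):
        prev, curr = curr, 2.0 * x * curr - prev
    return curr
```

For x in [−1, 1] this agrees with the trigonometric definition, and it reproduces the explicit formulas listed above, for example T4(x) = 8x⁴ − 8x² + 1.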
By using the Chebyshev polynomials, you can express 1, x, x², ..., x⁶ as
follows:
1 = T0(x)
x = T1(x)
x² = (1/2)[T2(x) + T0(x)]
x³ = (1/4)[3T1(x) + T3(x)]
x⁴ = (1/8)[3T0(x) + 4T2(x) + T4(x)]
x⁵ = (1/16)[10T1(x) + 5T3(x) + T5(x)]
x⁶ = (1/32)[10T0(x) + 15T2(x) + 6T4(x) + T6(x)]
These expressions are useful in the economization of power series.
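These identities can be verified numerically. The following sketch (Python; all helper names are ours) evaluates the right-hand sides for x⁴ and x⁶ using the recurrence and compares them with the plain powers:

```python
def T(n, x):
    # Chebyshev polynomial of degree n via the standard recurrence
    if n == 0:
        return 1.0
    prev, curr = 1.0, x
    for _ in range(n - 1):
        prev, curr = curr, 2.0 * x * curr - prev
    return curr

def x4_from_T(x):
    # x^4 = (1/8)[3 T0(x) + 4 T2(x) + T4(x)]
    return (3 * T(0, x) + 4 * T(2, x) + T(4, x)) / 8

def x6_from_T(x):
    # x^6 = (1/32)[10 T0(x) + 15 T2(x) + 6 T4(x) + T6(x)]
    return (10 * T(0, x) + 15 * T(2, x) + 6 * T(4, x) + T(6, x)) / 32
```

The same pattern verifies the remaining identities in the table.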
Thus, after omitting T5(x) you have sin x ≈ 0.8802 T1(x) − 0.03906 T3(x);
substituting T1(x) = x and T3(x) = 4x³ − 3x, you have
sin x ≈ 0.9974x − 0.1562x³
which gives sin x correctly upto three significant digits with only two terms for any
value of x in [−1, 1].
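The claimed accuracy of the two-term economized polynomial can be confirmed on a grid over [−1, 1] (Python sketch; illustrative only, using the coefficients as printed above):

```python
import math

def sin_econ(x):
    # two-term economized approximation quoted in the text
    return 0.9974 * x - 0.1562 * x ** 3

# worst-case error over a uniform grid in [-1, 1]
grid = [k / 100 for k in range(-100, 101)]
worst = max(abs(sin_econ(x) - math.sin(x)) for x in grid)
```

The worst error comes out well under 10⁻³, consistent with three-significant-digit accuracy.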
7.3.1 Chebychev Inequality
If X is a random variable having µ as expected value and σ2 as finite variance,
then for a positive real number k



Pr(|X − µ| ≥ kσ) ≤ 1/k².
Only when k > 1 do we get useful information. This is equivalent to
Pr(|X − µ| ≥ α) ≤ σ²/α².
For example, when k = √2, at least half of the values lie in the open
interval (µ − √2σ, µ + √2σ).
The theorem provides loose bounds. However, the bounds given by
Chebychev's inequality cannot, in general, be improved upon. For example, when k > 1,
the following distribution, which has σ = 1/k, meets the bounds exactly:

Pr(X = −1) = 1/(2k²),
Pr(X = 0) = 1 − 1/k²,
Pr(X = 1) = 1/(2k²),
so that Pr(|X − µ| ≥ kσ) = 1/k².
Equality holds exactly for this distribution and its linear transformations; for
other distributions the inequality is strict.
The theorem is useful because it applies to random variables of any distribution,
and the bounds can be computed knowing only the mean and variance.
For any distribution in which the standard deviation is defined, Chebyshev's
inequality gives the following bounds.
At least:
• 50% of the values lie within √2 standard deviations of the mean
• 75% of the values lie within 2 standard deviations
• 89% lie within 3 standard deviations
• 94% lie within 4 standard deviations
• 96% lie within 5 standard deviations
• 97% lie within 6 standard deviations
Standard deviations are always measured from the mean.
Generally:
A minimum of (1 − 1/k²) × 100% of the values lie within k standard deviations.
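The bound can be checked empirically for any distribution with finite variance. The sketch below (Python; the choice of an exponential(1) sample is ours, purely for illustration, since its mean and variance are both 1) compares observed tail fractions with 1/k²:

```python
import random

random.seed(1)

mu, sigma = 1.0, 1.0   # an exponential(1) variate has mean 1 and variance 1
sample = [random.expovariate(1.0) for _ in range(100_000)]

def tail_fraction(k):
    # observed fraction of the sample with |X - mu| >= k * sigma
    return sum(abs(x - mu) >= k * sigma for x in sample) / len(sample)
```

For this distribution the observed tails are far below the Chebyshev bound, illustrating how loose the bound usually is in practice.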
Example 7.2: Represent sin x = x − x³/3! + x⁵/5! − ... by using Chebyshev
polynomials to obtain three significant digit accuracy in the computation of sin x.
Solution: Using Chebyshev polynomials, sin x is approximated as:



sin x = T1(x) − (1/24)[3T1(x) + T3(x)] + (1/1920)[10T1(x) + 5T3(x) + T5(x)]
≈ 0.8802 T1(x) − 0.03906 T3(x) + 0.00052 T5(x)
As the coefficient of T5(x) is 0.00052 and |T5(x)| ≤ 1 for all x in [−1, 1],
omitting the last term still gives three significant digit accuracy.
Example 7.3: Express the Taylor series expansion of
e^(−x) = 1 − x + x²/2! − x³/3! + x⁴/4! − ...
in terms of Chebyshev polynomials.
Solution: The Chebyshev polynomial representation of
e^(−x) = 1 − x + x²/2! − x³/3! + ...
is obtained as
e^(−x) = T0(x) − T1(x) + (1/4)[T2(x) + T0(x)] − (1/24)[3T1(x) + T3(x)]
+ (1/192)[3T0(x) + 4T2(x) + T4(x)] − (1/1920)[10T1(x) + 5T3(x) + T5(x)] + ...
Thus,
e^(−x) = 1.26606 T0(x) − 1.13021 T1(x) + 0.27148 T2(x)
− 0.04427 T3(x) + 0.0054687 T4(x) − 0.0005208 T5(x) + ... .


Now, if you expand T0(x), T1(x), T2(x), T3(x), T4(x) and T5(x) using
their polynomial equivalents and truncate after six terms, then you have
e^(−x) = 1.00045 − 1.000022x + 0.4991992x² − 0.166488x³
+ 0.043794x⁴ − 0.008687x⁵
Comparing this representation with the Taylor series representation, you
observe that there is a slight difference in the coefficients of different powers of x.
The main advantage of this representation as a sum of Chebyshev polynomials is
that, for a given error bound, you can truncate the series with a smaller number of
terms compared to the Taylor series. Also, the error is more uniformly distributed
for various arguments. The possibility of a series with a lower number of terms is
called economization of power series. The maximum error in the six terms of the
Chebyshev representation of e^(−x) is 0.00045, whereas the error in the six terms of the
Taylor series representation of e^(−x) is 0.0014. Thus, you have to add one more
term in the Taylor series to ensure that the error is less than that in the Chebyshev
approximation.
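This error comparison can be reproduced numerically. In the sketch below (Python; the economized coefficients are taken as printed above, and the choice of grid is ours), both six-term polynomials are compared with e^(−x) over [−1, 1]:

```python
import math

# coefficients of the economized six-term polynomial as printed in the text
ECON = [1.00045, -1.000022, 0.4991992, -0.166488, 0.043794, -0.008687]

def exp_neg_taylor6(x):
    # first six Taylor terms of e^(-x)
    return sum((-x) ** k / math.factorial(k) for k in range(6))

def exp_neg_econ(x):
    # economized polynomial: sum of ECON[k] * x^k
    return sum(c * x ** k for k, c in enumerate(ECON))

grid = [k / 100 for k in range(-100, 101)]
taylor_worst = max(abs(exp_neg_taylor6(x) - math.exp(-x)) for x in grid)
econ_worst = max(abs(exp_neg_econ(x) - math.exp(-x)) for x in grid)
```

The economized polynomial has a noticeably smaller worst-case error than the Taylor polynomial of the same length, which is the point of the economization.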

CHECK YOUR PROGRESS


1. What two types of problems does approximation theory deal with?
2. Name the two ways to approximate a function using a polynomial.
3. Summarize the first few Chebyshev polynomials.

7.4 DISTRIBUTION

Binomial, Poisson, negative binomial and uniform distributions are some of the
discrete probability distributions. The random variables in these distributions assume
a finite or enumerably infinite number of values, but in nature there are random
variables which take an infinite number of values, i.e., variables that can take any
value in an interval. Such variables and their probability distributions are known as
continuous probability distributions.
A random variable X is said to be normally distributed if it has the following
probability density function:
f(x) = (1/(σ√(2π))) e^(−(1/2)((x − µ)/σ)²), for −∞ < x < ∞
where µ and σ > 0 are the parameters of distribution.
Normal Curve: The curve given by
yx = y0 e^(−(1/2)((x − µ)/σ)²)
is known as the normal curve. When the origin is taken at the mean,
yx = y0 e^(−x²/(2σ²)).

Figure 7.2 Normal Curve

Standard Normal Variate: A normal variate with mean zero and standard
deviation unity is called a standard normal variate.
That is; if X is a standard normal variate then E(X) = 0 and V(X) = 1.
Then, X ~ N(0, 1).
The moment generating function or MGF of a standard normal variate is
given as follows:
MX(t) = e^(µt + σ²t²/2) with µ = 0, σ = 1, i.e., MX(t) = e^(t²/2)

Frequently, a change of variable is made in the integral
(1/(σ√(2π))) ∫ e^(−(x − µ)²/(2σ²)) dx    (taken from −∞ to ∞)
by introducing the new variable
Z = (X − µ)/σ ~ N(0, 1).
This new random variable Z simplifies calculations of probabilities
concerning normally distributed variates.
Standard Normal Distribution: The distribution of the random variable
Z = (X − µ)/σ, known as the standard normal variate, is called the standard normal
distribution or unit normal distribution, where X has a normal distribution with mean µ
and variance σ².
The density function of Z is given as follows:
φ(Z) = (1/√(2π)) e^(−Z²/2), −∞ < Z < ∞
with mean 0, variance one and MGF e^(t²/2). The normal distribution is the most
frequently used distribution in statistics. Its importance is highlighted by the central
limit theorem and by its mathematical properties; quantities such as the height,
weight and blood pressure of normal individuals, heart diameter measurements,
etc., all follow the normal distribution if the number of observations is very large.
The normal distribution also has great importance in statistical inference theory.
Examples of Normal Distribution:
1. The height of men of matured age belonging to same race and living in
similar environments provide a normal frequency distribution.
2. The heights of trees of the same variety and age in the same locality would
confirm to the laws of normal curve.
3. The lengths of leaves of a tree form a normal frequency distribution. Though
some of them are very short and some are long, they tend towards their
mean length.
Example 7.4: X has a normal distribution with µ = 50 and σ² = 25. Find out:
(i) The approximate value of the probability density function for X = 50
(ii) The value of the distribution function for x = 50.
Solution: (i) f(x) = (1/(σ√(2π))) e^(−(x − µ)²/(2σ²)), −∞ < x < ∞.
For X = 50, σ² = 25, µ = 50, you have
f(50) = 1/(5√(2π)) ≈ 0.08.
(ii) The distribution function is
F(x) = ∫ from −∞ to x of (1/(σ√(2π))) e^(−(t − µ)²/(2σ²)) dt
= ∫ from −∞ to (x − µ)/σ of (1/√(2π)) e^(−Z²/2) dZ, where Z = (t − µ)/σ.
∴ F(50) = ∫ from −∞ to 0 of (1/√(2π)) e^(−Z²/2) dZ = 0.5.

Example 7.5: If X is a normal variable with mean 8 and standard deviation 4, find
(i) P[X ≤ 5]
(ii) P[5 ≤ X ≤ 10]

Solution: (i) P[X ≤ 5] = P[(X − µ)/σ ≤ (5 − 8)/4]
= P(Z ≤ −0.75)
= P(Z ≥ 0.75) [by symmetry]
= 0.5 − P(0 ≤ Z ≤ 0.75)
= 0.5 − 0.2734 [see Appendix]
= 0.2266.
(ii) P[5 ≤ X ≤ 10] = P[(5 − 8)/4 ≤ Z ≤ (10 − 8)/4]
= P(−0.75 ≤ Z ≤ 0.5)
= P(0 ≤ Z ≤ 0.75) + P(0 ≤ Z ≤ 0.5)
= 0.2734 + 0.1915 [see Appendix]
= 0.4649.
Example 7.6: X is a normal variate with mean 30 and S.D. 5. Find
(i) P[26 ≤ X ≤ 40]
(ii) P[|X − 30| > 5]
Solution: Here µ = 30, σ = 5.
(i) When X = 26, Z = (X − µ)/σ = −0.8,
and for X = 40, Z = (X − µ)/σ = 2.
∴ P[26 ≤ X ≤ 40] = P[−0.8 ≤ Z ≤ 2]
= P[0 ≤ Z ≤ 0.8] + P[0 ≤ Z ≤ 2]
= 0.2881 + 0.4772 = 0.7653
(ii) P[|X − 30| > 5] = 1 − P[|X − 30| ≤ 5]
P[|X − 30| ≤ 5] = P[25 ≤ X ≤ 35]
= P[(25 − 30)/5 ≤ Z ≤ (35 − 30)/5]
= 2 · P(0 ≤ Z ≤ 1)
= 2 × 0.3413 = 0.6826.
So P[|X − 30| > 5] = 1 − 0.6826 = 0.3174.
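Probabilities like those in Examples 7.5 and 7.6 can be computed without tables by using the error function. A minimal sketch (Python; Phi and normal_prob are our own helper names):

```python
import math

def Phi(z):
    # standard normal CDF expressed through the error function
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_prob(a, b, mu, sigma):
    # P(a <= X <= b) for X ~ N(mu, sigma^2)
    return Phi((b - mu) / sigma) - Phi((a - mu) / sigma)
```

For instance, Phi((5 - 8) / 4) reproduces the answer 0.2266 of Example 7.5(i), and 1 - normal_prob(25, 35, 30, 5) reproduces the answer 0.3174 of Example 7.6(ii).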
7.4.1 Central Limit Theorem
Let X1, X2, ..., Xn be n independent random variables all of which have the
same distribution. Let the common expectation and variance be µ and σ²
respectively.



Let X̄ = (1/n) Σ (i = 1 to n) Xi.
Then, the distribution of X̄ approaches the normal distribution with mean µ
and variance σ²/n as n → ∞.
That is, the variate Z = (X̄ − µ)/(σ/√n) has the standard normal distribution.
Proof: The moment generating function of Z about the origin is given as follows:
MZ(t) = E(e^(tZ)) = E[exp{t(X̄ − µ)/(σ/√n)}]
= e^(−µt√n/σ) · E[exp{(t√n/σ)X̄}]
= e^(−µt√n/σ) · E[exp{(t/(σ√n))(X1 + X2 + ... + Xn)}]
= e^(−µt√n/σ) · M(X1 + X2 + ... + Xn)(t/(σ√n))
= e^(−µt√n/σ) · [MX(t/(σ√n))]^n
This is because the random variables are independent and have the same
MGF. Taking logarithms, you have:
log MZ(t) = −µt√n/σ + n log MX(t/(σ√n))
= −µt√n/σ + n log[1 + µ1′t/(σ√n) + (µ2′/2!)(t/(σ√n))² + ...]
= −µt√n/σ + n[µ1′t/(σ√n) + µ2′t²/(2σ²n) + ... − (1/2){µ1′t/(σ√n)}² + ...]
= −µt√n/σ + µ1′t√n/σ + µ2′t²/(2σ²) − µ1′²t²/(2σ²) + ...
= t²/2 + O(n^(−1/2))        [since µ2′ − µ1′² = σ² and µ1′ = µ]
Hence, as n → ∞,
log MZ(t) → t²/2, i.e., MZ(t) → e^(t²/2).
However, this is the MGF of a standard normal random variable. Thus,
the random variable Z converges to N(0, 1).
It follows that the limiting distribution of X̄ is normal with mean µ and
variance σ²/n.
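The theorem can be illustrated by simulation. In the sketch below (Python; the choice of uniform(0, 1) summands and the sample sizes are ours, purely for illustration), standardized sample means are generated and the fraction falling within one standard deviation is compared with the standard normal value of about 0.683:

```python
import math
import random

random.seed(0)

def sample_mean(n):
    # mean of n independent uniform(0, 1) variates; mu = 1/2, sigma^2 = 1/12
    return sum(random.random() for _ in range(n)) / n

n, runs = 48, 20_000
mu, sigma = 0.5, math.sqrt(1 / 12)

# standardized sample means Z = (X_bar - mu) / (sigma / sqrt(n))
zs = [(sample_mean(n) - mu) / (sigma / math.sqrt(n)) for _ in range(runs)]

# for a standard normal variate, P(|Z| < 1) is about 0.6827
within_one_sd = sum(abs(z) < 1 for z in zs) / runs
```

Even for modest n the standardized mean is already very close to N(0, 1), which is what the theorem asserts in the limit.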

7.5 LAWS OF LARGE NUMBERS

Before stating the laws of large numbers, let us define an inequality known as
Kolmogorov's inequality. The set of Kolmogorov's inequalities was defined by
Kolmogorov in 1928.
Suppose X1, X2, ..., Xn is a set of independent random variables having
mean 0 and variances σ1², σ2², ..., σn².
Let Cn² = σ1² + σ2² + ... + σn².
Then the probability that all of the inequalities
|x1 + x2 + ... + xα| < λCn, α = 1, 2, ..., n
hold is at least 1 − 1/λ².
7.5.1 Weak Law of Large Numbers

Let x̄i = E(xi) denote the expectation of xi, and put
u = (x1 + x2 + ... + xn) − (x̄1 + x̄2 + ... + x̄n), En = E(u²).
Then, by Chebychev's inequality, for an arbitrary positive number α,
P[ |(x1 + x2 + ... + xn)/n − (x̄1 + x̄2 + ... + x̄n)/n| < α ] ≥ 1 − En/(n²α²)
≥ 1 − η, provided En/(n²α²) < η.
This is known as the Weak Law of Large Numbers. It can also be stated
as follows:
With probability approaching unity or certainty as near as you please, you
may expect that the arithmetic mean of the values actually assumed by n variates
will differ from the mean of their expectations by less than any given number,
however small, provided the number of variates can be taken sufficiently large and
provided the condition En/n² → 0 as n → ∞ is fulfilled.
In other words, a sequence X1, X2, ..., Xn of random variables is said to
satisfy the weak law of large numbers if
lim (n → ∞) P[ |Sn/n − E(Sn/n)| < ε ] = 1
for any ε > 0, where Sn = X1 + X2 + ... + Xn. The law holds provided Bn/n² → 0 as
n → ∞, where Bn = Var(Sn) < ∞.
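The weak law can be illustrated by simulation. The sketch below (Python; the fair-die example and the parameter values are our own choices for illustration) estimates P(|X̄ − 3.5| < ε) for a small and a large number of throws:

```python
import random

random.seed(2)

def die_mean(n):
    # arithmetic mean of n independent fair-die throws (expectation 3.5)
    return sum(random.randint(1, 6) for _ in range(n)) / n

def concentration(n, eps=0.25, runs=400):
    # estimate P(|X_bar - 3.5| < eps) by repeated simulation
    return sum(abs(die_mean(n) - 3.5) < eps for _ in range(runs)) / runs

small_n = concentration(20)     # only moderate concentration for small n
large_n = concentration(2000)   # near-certain concentration for large n
```

As n grows, the estimated probability approaches 1, exactly as the law predicts.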
7.5.2 Strong Law of Large Numbers
Consider the sequence X1, X2, ..., Xn of independent random variables with
expectations µk = E(Xk) and variances σk².
If Sn = X1 + X2 + ... + Xn and E(Sn) = mn, then the sequence S1, S2, ..., Sn
is said to obey the strong law of large numbers if to every pair ε > 0, δ > 0 there
corresponds an N such that there is a probability (1 − δ) or better that for every
r > 0 all r + 1 inequalities
|Sn − mn|/n < ε, n = N, N + 1, ..., N + r
will be satisfied.



Example 7.7: Examine whether the weak law of large numbers holds for the
sequence {Xk} of independent random variables defined as follows:
P(Xk = ± 2^k) = 2^−(2k + 1)
P(Xk = 0) = 1 − 2^−2k
Solution: E(Xk) = Σ Xk pk
= 2^k × 2^−(2k + 1) + (−2^k) × 2^−(2k + 1) + 0 × (1 − 2^−2k)
= 2^−(2k + 1) [2^k − 2^k] = 0.
E(Xk²) = Σ xk² pk
= (2^k)² × 2^−(2k + 1) + (−2^k)² × 2^−(2k + 1) + 0² × (1 − 2^−2k)
= 2^−(2k + 1) [2^2k + 2^2k] = 2^−1 + 2^−1 = 1.
∴ Var(Xk) = E(Xk²) − [E(Xk)]² = 1 − 0 = 1
Bn = Σ (k = 1 to n) Var(Xk) = n
∴ lim (n → ∞) Bn/n² = lim (n → ∞) n/n² = lim (n → ∞) 1/n = 0.
Hence, the weak law of large numbers holds for the sequence {Xk} of
independent random variables.
Example 7.8: Examine whether the law of large numbers holds for the sequence
{Xk} of independent random variables defined as follows:
P(Xk = ± k^(−1/2)) = 1/2
Solution: E(Xk) = Σ Xk pk = k^(−1/2) × 1/2 + (−k^(−1/2)) × 1/2 = 0
E(Xk²) = Σ xk² pk = (k^(−1/2))² × 1/2 + (−k^(−1/2))² × 1/2
= (1/2)k^(−1) + (1/2)k^(−1) = k^(−1)
Var(Xk) = E(Xk²) − [E(Xk)]² = k^(−1) − 0 = k^(−1)
Bn = Σ (k = 1 to n) Var(Xk) = Σ (k = 1 to n) k^(−1) ≤ 1 + log n
∴ lim (n → ∞) Bn/n² ≤ lim (n → ∞) (1 + log n)/n² = 0
Hence, the law of large numbers holds for the sequence {Xk} of independent
random variables.

7.6 NORMAL APPROXIMATION

Under certain circumstances one can use the normal distribution to approximate a
binomial distribution as well as a Poisson distribution.
If X ~ B(n, p) and the value of n is quite large with p quite close to ½, then X
can be approximated as N(np, npq), where q = 1 − p.
There are cases in which use of the normal distribution is found to be easier
than that of the binomial distribution.
Also, as already noted, the normal distribution may be utilized for
approximating the Poisson distribution when the value of λ is large. Here, λ is the
mean of the Poisson distribution.
Thus, X ~ Po(λ) → X ~ N(λ, λ) approximately when the value of λ is large.
Continuity Correction
The normal distribution is continuous, whereas both the Binomial and Poisson
distributions are discrete. This fact has to be kept in mind while making use of the
normal distribution for approximating the Binomial or Poisson distribution, and a
continuity correction must be applied.
Each probability, in the case of a discrete distribution, is represented using a
rectangle as shown in Figure 7.3(b).

Figure 7.3 (a) Continuous distribution

Figure 7.3 (b) Discrete Distribution


While working out probabilities, we include the whole rectangles when
applying the continuity correction.


Example 7.9: A fair coin is tossed 20 times. Find the probability of getting between
9 and 11 heads.
Solution: Let the random variable X represent the number of heads thrown. Then
X ~ Bin(20, ½).
As p = ½, the normal approximation can be used for the binomial
distribution, and we may write X ~ N(20 × ½, 20 × ½ × ½), i.e., X ~ N(10, 5).
In the diagram shown below, the rectangles show the binomial distribution,
which is discrete, and the curve shows the normal distribution, which is continuous in nature.

Using normal distribution for showing Binomial distribution


If it is desired to have P(9 ≤ X ≤ 11), shown by the shaded area, one may
note that the first rectangle begins at 8.5 and the last rectangle terminates at 11.5. By
making the continuity correction, the probability becomes P(8.5 < X < 11.5) in the
normal distribution. We may standardize this as given below:
= P[(8.5 − 10)/√5 < Z < (11.5 − 10)/√5]
= P(−0.671 < Z < 0.671)
= 2 × 0.7486 − 1 (using tables) ≈ 0.50
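As a cross-check on the continuity correction, the exact binomial sum P(9 ≤ X ≤ 11) ≈ 0.4966 can be compared with the normal approximation (Python sketch; helper names are ours):

```python
import math

def binom_pmf(n, k, p):
    # exact binomial probability of k successes in n trials
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

def Phi(z):
    # standard normal CDF via the error function
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

n, p = 20, 0.5
exact = sum(binom_pmf(n, k, p) for k in range(9, 12))        # P(9 <= X <= 11)

mu, sigma = n * p, math.sqrt(n * p * (1 - p))                # N(10, 5)
approx = Phi((11.5 - mu) / sigma) - Phi((8.5 - mu) / sigma)  # continuity correction
```

With the continuity correction the approximation agrees with the exact value to within about 0.001; standardizing 9 and 11 directly (without the correction) would noticeably understate the probability.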

CHECK YOUR PROGRESS


4. Name some of the discrete probability distributions.
5. Define a standard normal variate.
6. Who defined the set of Kolmogorov's inequalities, and when?
7. Under what circumstances can the normal approximation be applied to the
binomial and Poisson distributions?

7.7 SUMMARY

In this unit, you have learned that:


• Approximation theory deals with two types of problems:
– when a function is given explicitly and you want to find a simpler type,
such as a polynomial, for its representation
– when the problem concerns fitting a function to the given data and finding
the best function in a certain class to represent the data
• There are two ways to approximate a function by using a polynomial:
– Using a truncated Taylor series representation
– Using Chebyshev polynomials
• The normal distribution is used for:
– Statistical inference theory
– Calculations based on the Central Limit Theorem

7.8 KEY TERMS

• Differential equation: An equation that explains the association between


a function and its derivatives.
• Polynomial: A mathematical expression that is a sum of terms, where
each term consists of a variable or variables raised to a power and
multiplied by a coefficient.
• Standard normal variate: A normal variate which has zero as its mean
and unity as its standard deviation.

7.9 ANSWERS TO ‘CHECK YOUR PROGRESS’


1. The two types of problems that approximation theory deals with are: (i)
when a function is given explicitly and you want to find a simpler type, such
as a polynomial, for its representation, and (ii) when the problem concerns
fitting a function to the given data and finding the best function in a certain
class to represent the data.
2. The two ways to approximate a function using a polynomial are as follows:
(i) Using truncated Taylor’s series
(ii) Using Chebyshev polynomials
3. The first few Chebyshev polynomials can be summarized as follows:
T0 ( x) = 1
T1 ( x) = x



T2(x) = 2x² − 1
T3(x) = 4x³ − 3x
T4(x) = 8x⁴ − 8x² + 1
T5(x) = 16x⁵ − 20x³ + 5x
T6(x) = 32x⁶ − 48x⁴ + 18x² − 1
4. Binomial, Poisson, negative binomial and uniform distributions are some of
the discrete probability distributions.
5. A normal variate with mean zero and standard deviation unity, is called a
standard normal variate.
6. The set of Kolmogorov’s inequalities was defined by Kolmogorov in 1928.
7. When the number of trials n is large and the probability p is close to ½, the
normal approximation can be used for the binomial distribution; for the
Poisson distribution it can be used when the mean λ is large.

7.10 QUESTIONS AND EXERCISES


Short-Answer Questions
1. Write the first three terms of the Taylor series for sin x.
2. When is a random variable said to be normally distributed?
3. Give one example of a normal distribution.
4. A fair coin is tossed 30 times. Find the probability of getting between 8 and 12 heads.
Long-Answer Questions
1. Express the following as polynomials in x:
1 1
(i) T0 ( x) + 2T1 ( x) + T2 ( x) (ii) 2T0 ( x) − T2 ( x) + T4 ( x)
4 8
(iii) 5T0 ( x) + 2T1 ( x) + 4T2 ( x) − 8T3 ( x)
2. Express the following in terms of Tn(x):
(i) 1 + x − x 2 + x3 − x 4
(ii) 1 − x 2 + 2 x 4 − 4 x6
(iii) 1 − x + x 2 − x3 + x 4
3. The marks obtained by a number of students in certain subjects are assumed
to be approximately normally distributed with mean µ = 65 and
S.D. = 5. If three students are taken at random from this set, what is the
probability that exactly two of them will have marks over 70?
4. Compute cos x correctly upto three significant digits. Obtain a series with the
minimum number of terms using the Taylor series and the Chebyshev series.



5. Obtain a Taylor series expansion of e^x and represent it in terms of Chebyshev
polynomials.
6. Show that for each Chebyshev polynomial Tn(x) with n ≥ 1, you have
∫ from −1 to 1 of [Tn(x)]²/√(1 − x²) dx = π/2.
7. Describe the laws of large numbers.
8. Prove the central limit theorem.

7.11 FURTHER READING

Sivayya, K.V. and K. Satya Rao. Business Mathematics. Guntur: Technical


Publishers.
Sancheti, D.C. and V.K. Kapoor. Business Mathematics. New Delhi: Sultan
Chand & Sons.
Chance, William A. 1969. Statistical Methods for Decision Making. Illinois:
Richard D Irwin.
Trivedi, K.S. 1994. Probability and Statistics with Reliability. New Delhi: PHI.
Levin, Richard. I., and David. S. Rubin. 1997. Statistics for Management, 7th
edition, New Jersey: Prentice Hall International.
Gupta, S.C., and V.K. Kapoor. 1997. Fundamentals of Mathematical Statistics.
9th Revised Edition, New Delhi: Sultan Chand & Sons.
Goon, A.M., M.K. Gupta, and B. Das Gupta. 1983. Fundamentals of Statistics.
Vols. I & II, Calcutta: The World Press Pvt. Ltd.



Statistical Inferences

UNIT 8 STATISTICAL INFERENCES


Structure
8.0 Introduction
8.1 Unit Objectives
8.2 Sampling Theory
8.2.1 The Two Concepts: Parameter and Statistic
8.2.2 Objects of Sampling Theory
8.2.3 Sampling Distribution
8.2.4 The Concept of Standard Error (or S.E.)
8.2.5 Procedure of Significance Testing
8.3 Test of Hypothesis
8.4 Test of Significance
8.4.1 Two-Tailed and One-Tailed Test
8.4.2 Testing a Hypothesis
8.5 Regression
8.5.1 Regression: Definition
8.5.2 Regression Coefficients
8.5.3 Non-Linear Regression
8.5.4 Multiple Linear Regression
8.5.5 Goodness of Fit
8.5.6 Estimate
8.6 Summary
8.7 Key Terms
8.8 Answers to ‘Check Your Progress’
8.9 Questions and Exercises
8.10 Further Reading

8.0 INTRODUCTION

In this unit you will learn about statistical inferences. Population in statistics does
not necessarily mean human population alone. It means a complete set of objects
under study, living or non-living. Each individual or object is called unit or member
or element of that population. If there are a finite number of elements then it is
called finite population. If the number of elements is infinite then it is called infinite
population. The procedure of testing the validity of our hypothesis using sample
statistics is called Testing of Hypothesis.
The techniques of regression analysis are employed in statistics for modeling
and analysing several variables, when the focus is on the relationship between a
dependent variable and one or more independent variables. More particularly,
regression analysis helps us understand how the typical value of the dependent
variable changes when any one of the independent variables is varied, while the
other independent variables are held fixed. Generally, regression analysis estimates
the conditional expectation of the dependent variable given the independent



variables, that is, the average value of the dependent variable when the independent
variables are held fixed. Less frequently, the focus is on a quantile or other location
parameter of the conditional distribution of the dependent variable given the
independent variables. Mostly, the estimation target is a function of the independent
variables called the regression function. In regression analysis, we also undertake
to characterize the variation of the dependent variable around the regression
function, which can be described by a probability distribution.

8.1 UNIT OBJECTIVES

After going through this unit, you will be able to:


• Understand sampling theory and testing of hypothesis
• Describe test of significance for all types of samples
• Understand what is regression
• Explain regression coefficients
• Elaborate Non-linear Regression
• Examine Multiple linear Regression
• Understand what is Goodness of Fit
• Explain Standard Error of Estimate

8.2 SAMPLING THEORY

A universe is the complete group of items about which knowledge is sought. The
universe may be finite or infinite. Finite universe is one which has a definite and
certain number of items but when the number of items is uncertain and infinite, the
universe is said to be an infinite universe. Similarly the universe may be hypothetical
or existent. In the former case the universe in fact does not exist and we can only
imagine the items constituting it. Tossing of a coin or throwing of a die are examples
of hypothetical universes. Existent universe is a universe of concrete objects, i.e.,
the universe where the items constituting it really exist. On the other hand, the term
sample refers to that part of the universe which is selected for the purpose of
investigation. The theory of sampling studies the relationships that exist between
the universe and the sample or samples drawn from it.
8.2.1 The Two Concepts: Parameter and Statistic
It would be appropriate to explain the meaning of two terms viz., parameter and
statistic. All the statistical measures based on all items of the universe are termed
as parameters whereas statistical measures worked out on the basis of sample
studies are termed as sample statistics. Thus, a sample mean or a sample standard
deviation is an example of statistic whereas the universe mean or universe standard
deviation is an example of a parameter.
The main problem of sampling theory is the problem of the relationship between
a parameter and a statistic. The theory of sampling is concerned with estimating
the properties of the population from those of the sample and also with gauging the
precision of the estimate. This movement from the particular (the sample) towards
the general (the universe) is what is known as statistical induction or statistical inference. In
more clear terms, ‘from the sample we attempt to draw inferences concerning the
universe. In order to be able to follow this inductive method, we first follow a
deductive argument which is that we imagine a population or universe (finite or
infinite) and investigate the behaviour of the samples drawn from this universe
applying the laws of probability.’ The methodology dealing with all this is known as
Sampling Theory.
8.2.2 Objects of Sampling Theory

Sampling theory is to attain one or more of the following objectives:


(a) Statistical Estimation. Sampling theory helps in estimating unknown
population quantities, or what are called parameters, from a knowledge of
statistical measures based on sample studies, often called 'statistics'. In
other words, to obtain the estimate of parameter from statistic is the main
objective of the sampling theory. The estimate can either be a point estimate
or it may be an interval estimate. Point estimate is a single estimate
expressed in the form of a single figure but interval estimate has two limits,
the upper and lower limits. Interval estimates are often used in statistical
induction.
(b) Tests of Hypotheses or Tests of Significance. The second objective of
sampling theory is to enable us to decide whether to accept or reject
hypotheses or to determine whether observed samples differ significantly
from expected results. The sampling theory helps in determining whether
observed differences are actually due to chance or whether they are really
significant. Tests of significance are important in the theory of decisions.
(c) Statistical Inference. Sampling theory helps in making generalization about
the universe from the studies based on samples drawn from it. It also helps
in determining the accuracy of such generalizations.
8.2.3 Sampling Distribution
In sampling theory we are concerned with what is known as the sampling
distribution. For this purpose we can take certain number of samples and for each
sample we can compute various statistical measures such as mean, standard
deviation etc. It is to be noted that each sample will give its own value for the
statistic under consideration. All these values of the statistic together with their
relative frequencies with which they occur, constitute the sampling distribution.
We can have sampling distribution of means or the sampling distribution of standard
deviations or the sampling distribution of any other statistical measure. The sampling
distribution tends quite close to the normal distribution if the number of samples is
large. The significance of the sampling distribution follows from the fact that the
mean of a sampling distribution is the same as the mean of the universe.
Thus, the mean of the sampling distribution can be taken as the mean of the
universe.
8.2.4 The Concept of Standard Error (or S.E.)
The standard deviation of sampling distribution of a statistic is known as its standard
error and is considered the key to sampling theory. The utility of the concept of
standard error in statistical induction arises on account of the following reasons:
(a) The standard error helps in testing whether the difference between observed
and expected frequencies could arise due to chance. The criterion usually
adopted is that if a difference is upto 3 times the S.E. then the difference is
supposed to exist as a matter of chance and if the difference is more than 3
times the S.E., chance fails to account for it, and we conclude the difference
as significant difference. This criterion is based on the fact that at x ± 3(S.E.),
the normal curve covers an area of 99.73 per cent. The product of the
critical value at certain level of significance and the S. E. is often described
as the Sampling Error at that particular level of significance. We can test the
difference at certain other levels of significance as well depending upon our
requirement.
(b) The standard error gives an idea about the reliability and precision of a
sample. If the relationship between the standard deviation and the sample
size is kept in view, one would find that the standard error is smaller than
the standard deviation. The smaller the S.E. the greater the uniformity of
the sampling distribution and hence greater is the reliability of sample.
Conversely, the greater the S.E., the greater the difference between
observed and expected frequencies and in such a situation the unreliability
of the sample is greater. The size of S.E. depends upon the sample size;
the greater the number of items included in the sample the smaller the
error to be expected and vice versa.
(c) The standard error enables us to specify the limits, maximum and minimum,
within which the parameters of the population are expected to lie with a
specified degree of confidence. Such an interval is usually known as
confidence interval. The degree of confidence with which it can be asserted
that a particular value of the population lies within certain limits is known
as the level of confidence.
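As a sketch of how such an interval estimate is computed from the standard error (the function name and the 95 per cent critical value 1.96 are illustrative choices, not prescribed by the text):

```python
import math

def confidence_interval(sample_mean, sigma, n, z=1.96):
    # Interval estimate: sample mean +/- z * S.E. of the mean,
    # with z = 1.96 for the 95 per cent level of confidence
    se = sigma / math.sqrt(n)
    return sample_mean - z * se, sample_mean + z * se

# With sample mean 100, sigma = 10 and n = 25: S.E. = 2,
# giving an interval of about (96.08, 103.92)
low, high = confidence_interval(100, 10, 25)
```

A different level of confidence is obtained simply by passing another critical value for z.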

8.2.5 Procedure of Significance Testing


The following sequential steps constitute, in general, the procedure of significance
testing:
(a) Statement of the Problem. First, the problem has to be stated in clear
terms. It should be quite clear as to in respect of what the statistical decision
has to be taken. The problem may be: whether the hypothesis is to be
rejected or accepted? Is the difference between a parameter and a statistic
significant? Or the like.
(b) Defining the Hypothesis. Usually, we start with the null hypothesis
according to which it is presumed that there is no difference between a
parameter and a statistic. If we are to take a decision on whether the students
have benefited from the extra coaching, and we start with the
supposition that they have not benefited, then this supposition would
be termed the null hypothesis, which in symbolic form is denoted by H0.
As against null hypothesis, the researcher may as well start with some
alternative hypothesis, (symbolically H1) which specifies those values that
the researcher believes to hold true and then may test such hypothesis on
the basis of sample data. Only one alternative hypothesis can be tested
at one time against the null hypothesis.
(c) Selecting the Level of Significance. The hypothesis is examined on a
pre-determined level of significance. Generally, either 5 per cent level or
1 per cent level of significance is adopted for the purpose. However, it
can be stated here that the level of significance must be adequate keeping
in view the purpose and nature of enquiry.
(d) Computation of the Standard Error. After determining the level of
significance the standard error of the concerning statistic (mean, standard
deviation or any other measure) is computed. There are different formulae
for computing the standard errors of different statistics. For example, the
Standard Error of the Mean = Standard Deviation/√n; the standard error of the
Standard Deviation = Standard Deviation/√(2n); the standard error of Karl
Pearson's Coefficient of Correlation = (1 − r²)/√n; and so on. (A detailed
description of important standard error formulae has been given on the
pages that follow.)
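The three standard error formulae above translate directly into code; a minimal sketch (the function names are ours, not standard library calls):

```python
import math

def se_mean(sigma, n):
    # Standard error of the mean = standard deviation / sqrt(n)
    return sigma / math.sqrt(n)

def se_sd(sigma, n):
    # Standard error of the standard deviation = standard deviation / sqrt(2n)
    return sigma / math.sqrt(2 * n)

def se_corr(r, n):
    # Standard error of Karl Pearson's coefficient of correlation
    # = (1 - r^2) / sqrt(n)
    return (1 - r ** 2) / math.sqrt(n)
```

For instance, a sample of n = 25 items with a standard deviation of 10 gives a standard error of the mean of 10/5 = 2.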
(e) Calculation of the Significance Ratio. The significance ratio, symbolically
described as z, t, F etc. depending on the test we use, is often calculated
by dividing the difference between a parameter and a statistic by the standard
error concerned. Thus, in the context of the mean of a small sample when the
population variance is not known, t = (x̄ − µ)/S.E.(x̄), and in the context of the
difference between two sample means, t = (x̄1 − x̄2)/S.E.(x̄1 − x̄2). (All this has been fully

explained while explaining sampling theory in respect of small samples of
variables later in this unit).
(f) Deriving the Inference. The significance ratio is then compared with the
predetermined critical value. If the ratio exceeds the critical value then the
difference is taken as significant but if the ratio is less than the critical value,
the difference is considered insignificant. For example, the critical value at
5 per cent level of significance is 1.96. If the computed value exceeds 1.96
then the inference would be that the difference at 5 per cent level is
significant and this difference is not the result of sampling fluctuations but the
difference is a real one and should be understood as such.
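Steps (d) to (f) can be sketched together in a few lines (the function names and the sample figures are illustrative assumptions):

```python
import math

def significance_ratio(statistic, parameter, standard_error):
    # Step (e): divide the difference between statistic and parameter by its S.E.
    return (statistic - parameter) / standard_error

def is_significant(ratio, critical=1.96):
    # Step (f): significant if the ratio exceeds the critical value
    # (1.96 at the 5 per cent level for a two-tailed test)
    return abs(ratio) > critical

# Illustration: sample mean 103 against a hypothesised mean 100, sigma = 10, n = 49
se = 10 / math.sqrt(49)               # S.E. of the mean = 10/7
z = significance_ratio(103, 100, se)  # about 2.1
```

Since 2.1 exceeds 1.96, the difference would be judged significant at the 5 per cent level.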

8.3 TEST OF HYPOTHESIS

In statistics, population does not mean human population alone. A complete set of
objects under study, living or non-living is called population or universe for example,
graduates in Pondy, Bajaj tube lights, Ceat car tyres etc. Each individual or object
is called a unit or member or element of that population. If there are a finite number
of elements then it is called finite population. If the number of elements is infinite
then it is called infinite population. Take certain examples to know the parameters
of a population like the average life span of a Bajaj bulb, the mileage covered by a Ceat
car tyre, the percentage of defective products of a company, etc. Suppose you want
to know the average mileage of a Ceat car tyre. All the tyres manufactured by that
company will constitute the population. Each tyre is run and the average mileage is
calculated which is known as the population average. This average calculated for
the entire population is called a parameter. Similarly any constant calculated for
the whole population is called a parameter.
Suppose there are N tyres and all of them are moving. Let X1, X2, ..., XN be
the mileages of the N tyres. Now these N numbers constitute a population. These N
values are the values of a single variable X namely mileage. This variable X may
have a probability density function f(X, θ) where θ is an unknown parameter. This
pdf f(X, θ) describes the population completely when θ is known. Now, this f(X,
θ) is itself considered as a population. Since it is not possible to test each and
every tyre, we do not get the data from that complete population and hence the
parameter cannot be calculated. When it is unknown, we may assume that the
form of f(X, θ) is known. When the parameter θ is unknown an assumption of
possible value of θ is called hypothesis. To test the validity of the hypothesis we
make use of sample observations and statistics. The procedure of testing the validity
of our hypothesis using sample statistics is called Testing of Hypothesis.
For testing a hypothesis, a value z = |x̄ − µ|/(σ/√n) is calculated from the
given data, where x̄ is the sample mean, µ is the population mean, σ is the standard
deviation, z is the test statistic, 1.281 is the z value at the 10% significance level,
and n is the sample size.
For example, to ensure that 90 per cent of bulbs have more than 975 hours
of life (with µ = 1000 and σ = 125), we require

|x̄ − µ|/(σ/√n) ≤ 1.281

i.e., |975 − 1000|/(125/√n) = √n/5 ≤ 1.281

√n = 5 × 1.281 = 6.405 ≈ 6.4

n = 41 (approximately)
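The arithmetic of this worked example can be checked in a couple of lines (a sketch; the variable names are ours):

```python
# |xbar - mu| / (sigma / sqrt(n)) <= 1.281, with |975 - 1000| = 25 and sigma = 125,
# reduces to sqrt(n)/5 <= 1.281
root_n = 1.281 * 125 / abs(975 - 1000)   # sqrt(n) = 6.405
n = round(root_n ** 2)                   # 41.02... rounds to 41, as in the text
```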

8.4 TEST OF SIGNIFICANCE

A very important aspect of sampling theory is the study of tests of significance,
which give us a ground to decide whether:
(i) the deviation between the observed sample statistic and the hypothetical
parameter value, or
(ii) the deviation between two independent sample statistics,
is significant or might be attributed to chance or the fluctuations of sampling.
Null and Alternative hypothesis: A statistical hypothesis is a statement about a
population parameter. There are two types of statistical hypothesis:
(i) Null hypothesis (ii) alternative hypothesis
Null hypothesis: It is a hypothesis of no difference from the present state and is
tested to verify its validity. It is generally represented by H0. H0 is always tested on
the basis of sample information which may or may not be consistent with it. If the
sample information is found to be consistent with the null hypothesis it is accepted
otherwise rejected.
Alternative hypothesis: A hypothesis which is accepted when H0 is rejected is
called the alternative hypothesis and is represented by H1. For example, if we are
testing whether the population mean µ equals a specified value µ0, we may
consider H0: µ = µ0.
Then, the alternative hypothesis may be:
(i) H1 : µ ≠ µ0 {two-tailed}
(ii) H1 : µ > µ0 {one-tailed}
(iii) H1 : µ < µ0 {one-tailed}
The level of confidence or level of significance expresses the degree of
credibility that may be attached to the result. The probability that we associate
with the interval estimate of a population parameter indicating how confident we

are that the interval estimate will include the true population parameter is called
the confidence level. A higher probability means more confidence.
Level of Significance is defined as the probability of rejecting the null
hypothesis when it is true. For example, a five per cent level of significance means
that there are about five chances in 100 that you may reject the null
hypothesis when it is true. The probability that a random value of the test
statistic t belongs to the critical region is known as the level of significance.
P[t ∈ w/H0] = α
Critical Region: A region corresponding to a statistic t, in the sample space S,
which amounts to rejection of H0 is known as the critical region or region of rejection.
The region of the sample space S which amounts to the acceptance of H0 is called
acceptance region.
8.4.1 Two-Tailed and One-Tailed Test
A two-tailed test rejects the null hypothesis if the sample mean is either more or
less than the hypothesized value of the population mean. It is considered apt when
the null hypothesis specifies some particular value whereas the alternative hypothesis
states that the parameter is not equal to that value. In a two-tailed curve there are two
rejection regions, also called critical regions (refer Figure 8.1).
In a two-tailed test at the 5% significance level, the acceptance region (accept H0
if the sample mean X̄ falls in this region) covers 0.475 of the area on each side of
µ, i.e., 0.95 or 95% of the area in all, and H0 is rejected if the sample mean falls
in either rejection region beyond Z = −1.96 or Z = 1.96.

Figure 8.1 Rejection Regions in a Two-Tailed Curve


Conditions for the Occurrence of One-tailed Test

When the population mean is either lower or higher than some hypothesised value,
a one-tailed test is considered to be appropriate. Where the rejection region lies only
on the left tail of the curve, it is known as a left-tailed test (refer Figure 8.2).

Figure 8.2 Left-Tailed Test

For example, what will happen if the acceptance region is made larger? α will
decrease. It will then be more easily possible to accept H0 when H0 is false (a Type II
error); i.e., it will lower the probability of making a Type I error, but raise the
probability β of making a Type II error.

Note: α and β are probabilities of making an error; 1 – α and 1 – β are probabilities of
making correct decisions.

Example 8.1: Can we say α + β = 1?

Solution: No. Each is concerned with a different type of error. However, the two are
not independent of each other.

Committing Errors: Type I and Type II
In how many ways can we commit errors?
We reject a hypothesis when it may be true. This is a Type I Error.
We accept a hypothesis when it may be false. This is a Type II Error.
The other true situations are desirable:
We accept a hypothesis when it is true. We reject a hypothesis when it is
false.
             Accept H0                          Reject H0
H0 True      Accept True H0 (Desirable)         Reject True H0 (Type I Error)
H0 False     Accept False H0 (Type II Error)    Reject False H0 (Desirable)

The level of significance implies the probability of type I error. A 5 per cent
level implies that the probability of committing a type I error is 0.05. A 1 per cent
level implies 0.01 probability of committing type I error.
Lowering the significance level and hence the probability of type I error is
good, but unfortunately it would lead to the undesirable situation of committing
type II error.
Type I Error: Reject H0 when it is true.
Type II Error: Accept H0 when it is wrong, i.e., reject H1 when it is true.
In practice, type I error amounts to rejecting a lot when it is good and
type II error may be regarded as accepting a lot when it is bad.
Thus P(rejecting a lot when it is good) = α
P(accepting a lot when it is bad) = β
where α and β are referred to as Producer’s risk and consumer’s risk respectively.
Power of Test: A good test should accept H0 when it is true and reject H0 when
it is false. The quantity (1 – β) measures how well the test is working and is
called the power of the test.

Power of test = (1 – β) = (No. of sample points leading to rejection of H0
when it is false) / (Total no. of sample points)
Degree of freedom: For a fixed value of the mean the number of free choices is
called the degree of freedom.
8.4.2 Testing a Hypothesis
The following steps facilitate the testing of a hypothesis.
Step 1: Set up H0.
Step 2: Set up an alternative hypothesis H1. This will enable us to decide whether
we have to use a two-tailed test or a one-tailed test.


Step 3: Choose an appropriate level of significance 'α'. This is to be decided
before a sample is drawn.
Step 4: Compute the test statistic

Z = (t − E(t)) / S.E.(t)
under the null hypothesis.
Step 5: We compare Z with the significant (tabulated) value zα at the given
significance level 'α'.
If | z | < zα, we say it is not significant. By this we mean that the difference
t – E(t) is just due to fluctuations of sampling and the sample data do not provide
us sufficient evidence against the H0 which may therefore, be accepted.
If | z | > zα, then reject H0 with confidence coefficient (1 – α).
The critical value of zα of the test statistic at level of significance α for a
two-tailed test is given by
p(| z | > zα) = α ...(8.1)
i.e., zα is the value of z so that the total area of the critical region on both
tails is α. Since the normal curve is symmetrical, from equation (8.1), we get
p ( z > zα ) + p ( z < − zα ) = α; i.e., 2 p ( z > zα ) = α; p ( z > zα ) = α/2
i.e., the area of each tail is α/2.
The critical value zα is that value such that the area to the right of zα is
α/2 and the area to the left of –zα is α/2.
In the case of the one-tailed test,
p(z > zα) = α if it is right-tailed; p(z < –zα) = α if it is left-tailed.
The critical value of z for a single-tailed test (right or left) at level of significance
α is the same as the critical value of z for two-tailed test at level of significance 2α.
Using these relations and the normal tables, the critical values of z at
different levels of significance (α) for both single-tailed and two-tailed tests are
calculated and listed below. The relations are:
p(|z| > zα) = α (two-tailed); p(z > zα) = α (right-tailed); p(z < −zα) = α (left-tailed)
Table 8.1 Critical Values of z at Different Levels of Significance

                     1% (0.01)       5% (0.05)       10% (0.1)
Two-tailed test      |zα| = 2.58     |zα| = 1.96     |zα| = 1.645
Right-tailed test    zα = 2.33       zα = 1.645      zα = 1.28
Left-tailed test     zα = –2.33      zα = –1.645     zα = –1.28
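Table 8.1 can be kept as a small lookup table for Step 5; a sketch (the dictionary layout and function name are our own):

```python
# Critical values of z from Table 8.1, keyed by (kind of test, level alpha)
CRITICAL_Z = {
    ("two-tailed", 0.01): 2.58,   ("two-tailed", 0.05): 1.96,   ("two-tailed", 0.10): 1.645,
    ("right-tailed", 0.01): 2.33, ("right-tailed", 0.05): 1.645, ("right-tailed", 0.10): 1.28,
    ("left-tailed", 0.01): -2.33, ("left-tailed", 0.05): -1.645, ("left-tailed", 0.10): -1.28,
}

def reject_h0(z, alpha=0.05, kind="two-tailed"):
    # Reject H0 when the computed z falls in the critical region of Table 8.1
    zc = CRITICAL_Z[(kind, alpha)]
    if kind == "two-tailed":
        return abs(z) > zc
    if kind == "right-tailed":
        return z > zc
    return z < zc  # left-tailed
```

For example, z = 2.1 is significant in a two-tailed test at the 5 per cent level, while z = 1.5 is not.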
Here, the test statistic is z = (t − E(t))/S.E.(t) ~ N(0, 1).

CHECK YOUR PROGRESS


1. What would you use to test the validity of hypothesis?
2. What is one very important aspect of sampling theory?
3. What is the function of a two-tailed test?

8.5 REGRESSION

8.5.1 Regression: Definition


Regression means ‘stepping back towards the average’.
Definition: Regression analysis is a mathematical measure of the average
relationship between two or more variables in terms of the original units of the data.
There are two types of variables: (i) the variable whose value is influenced or is to be
predicted is called the dependent variable, and (ii) the variable which influences the
values or is used for prediction is called the independent variable.
Line of Regression: If the variables in a bivariate distribution are related, we will
find that the points in the scatter diagram will cluster around some curve called the
‘curve of regression’. If the curve is a straight line, it is called the line of regression
and there is said to be linear regression between the variables, otherwise regression
is said to be curvilinear.
The line of regression is the line which gives the best estimate to the value of
one variable for any specific value of the other variable. Thus, the line of regression
is the line of ‘best fit’ and is obtained by the principles of least square.
Let us suppose, in a bivariate distribution (xi, yi), i = 1, 2, ... n; Y is dependent
and X is independent variable. Let the line of regression be
Y = a + b.X
according to the principle of least square the normal equations for estimating a and
b are
Σ yi = na + b Σ xi    ...(8.2)

and Σ xi yi = a Σ xi + b Σ xi²    ...(8.3)

Dividing Equation (8.2) by n, we get

ȳ = a + b x̄    ...(8.4)
x̄ and ȳ are the mean values of X and Y. Thus, the line of regression of Y on
X passes through the point (x̄, ȳ). Now,

Cov(X, Y) = (1/n) Σ xi yi − x̄ ȳ  ⇒  (1/n) Σ xi yi = x̄ ȳ + Cov(X, Y)

also σX² = (1/n) Σ xi² − x̄²  ⇒  (1/n) Σ xi² = σX² + x̄²

Dividing Equation (8.3) by n and using the above relations, we obtain

Cov(X, Y) + x̄ ȳ = a x̄ + b(σX² + x̄²)    ...(8.5)

Multiplying Equation (8.4) by x̄,

x̄ ȳ = a x̄ + b x̄²    ...(8.6)

Subtracting Equation (8.6) from Equation (8.5),

⇒ Cov(X, Y) = b σX²

⇒ b = Cov(X, Y)/σX²
Solving Equations (8.2) and (8.3), we get

b̂ = (n Σ xi yi − Σ xi Σ yi) / (n Σ xi² − (Σ xi)²)

and â = ȳ − b̂ x̄

where â and b̂ are the least square estimates of (a, b); therefore the least squares
estimated value ŷi of yi is given by

ŷi = â + b̂ xi.
Since b is the slope of the line of regression of Y on X and since the line of
regression passes through the point ( x , y ) , its equation is
Y − ȳ = b(X − x̄) = (Cov(X, Y)/σX²)(X − x̄)

⇒ Y − ȳ = r (σY/σX)(X − x̄)

Similarly, the line of regression of X on Y is

X − x̄ = r (σX/σY)(Y − ȳ)
where r is the correlation coefficient:

r(X, Y) = Cov(X, Y)/(σX σY) = (n Σ xi yi − Σ xi Σ yi) / [√(n Σ xi² − (Σ xi)²) √(n Σ yi² − (Σ yi)²)]

where Cov(X, Y) = E(XY) − E(X) E(Y)
σX² = Var(X) = E(X²) − [E(X)]²
σY² = Var(Y) = E(Y²) − [E(Y)]².
In case of perfect correlation, positive or negative i.e., r = ±1, both the lines
of regression coincide. Therefore, in general, we have two lines of regression
except in the particular case of perfect correlation when both the lines coincide
and we get only one line.
(i) If θ is the obtuse angle between the two lines of regression, then

tan θ = ((r² − 1)/r) · σX σY/(σX² + σY²)

(ii) If r = 0, then tan θ = ∞ and θ = π/2; i.e., if the two variables are not
related then the lines of regression become perpendicular to each other.
(iii) If r = ±1, then tan θ = 0 ⇒ θ = 0°. That is, the lines of regression will
coincide.
8.5.2 Regression Coefficients
Regression coefficients of Y on X and X on Y are given as:

byx = r (σY/σX) = (n Σxy − (Σx)(Σy)) / (n Σx² − (Σx)²)

and bxy = r (σX/σY) = (n Σxy − (Σx)(Σy)) / (n Σy² − (Σy)²)
Some properties of regression coefficients are given below:
(i) The correlation coefficient between two variables X and Y is the
geometric mean of the two regression coefficients byx and bxy,
i.e., bxy · byx = r², or r = ±√(bxy · byx),
where r is the correlation coefficient (taking the sign common to byx and bxy).
(ii) The arithmetic mean of the regression coefficients is greater than the
correlation coefficient,
i.e., (bxy + byx)/2 > r
(iii) If bxy > 1, then byx < 1 and vice versa.
(iv) If θ is the acute angle between the two lines of regression, then
tan θ = ((1 − r²)/r) · σX σY/(σX² + σY²), as r² ≤ 1.
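The raw-sum formulae for byx and bxy, together with property (i), can be sketched as follows (the function names are ours):

```python
def regression_coefficients(xs, ys):
    # b_yx and b_xy from the raw sums, as in the formulae above
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    num = n * sxy - sx * sy
    byx = num / (n * sxx - sx * sx)
    bxy = num / (n * syy - sy * sy)
    return byx, bxy

def correlation(xs, ys):
    # Property (i): r is the geometric mean of b_yx and b_xy,
    # taking the sign common to both coefficients
    byx, bxy = regression_coefficients(xs, ys)
    sign = 1.0 if byx >= 0 else -1.0
    return sign * (byx * bxy) ** 0.5

# Perfectly linear data y = 2x gives b_yx = 2, b_xy = 0.5 and r = 1
byx, bxy = regression_coefficients([1, 2, 3], [2, 4, 6])
```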
Example 8.2: The equations of two regression lines, obtained in a correlation
analysis of 60 observations, are
5x = 6y + 24 and 1000y = 768x − 3608
What is the correlation coefficient? Show that the ratio of the coefficient of variability
of X to that of Y is 5/24. What is the ratio of the variances of x and y?
Solution: 5x = 6y + 24 ⇒ x = (6/5)y + 24/5

so bxy = 6/5, i.e., r (σX/σY) = 6/5    ...(1)

Similarly, byx = 768/1000, i.e., r (σY/σX) = 768/1000    ...(2)

Multiplying Equations (1) and (2), we get

r² = 0.9216 ⇒ r = 0.96

Dividing Equation (1) by Equation (2),

σX²/σY² = 1.5625 ⇒ σX/σY = 1.25 = 5/4

Since the regression lines pass through the point (x̄, ȳ),

5x̄ = 6ȳ + 24 and 1000ȳ = 768x̄ − 3608

On solving these two, we have x̄ = 6, ȳ = 1.

Coefficient of variability of X = σX/x̄
Coefficient of variability of Y = σY/ȳ

ratio = (σX/x̄) × (ȳ/σY) = (σX/σY)(ȳ/x̄) = (5/4)(1/6) = 5/24
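Example 8.2's arithmetic can be verified directly from the two slopes (a check sketch):

```python
import math

# From the two regression lines of Example 8.2: b_xy = 6/5 and b_yx = 768/1000
bxy, byx = 6 / 5, 768 / 1000
r = math.sqrt(bxy * byx)          # geometric mean of the slopes
sd_ratio = math.sqrt(bxy / byx)   # sigma_X / sigma_Y
```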

8.5.3 Non-Linear Regression
In the previous topic we discussed the linear relationship
between two variables X and Y. The relationship between two variables may not
be linear. It may be of second degree, of the form y = a + bx + cx², or even of
higher degree. Here also we can apply the method of least squares to find the
estimates of a, b and c for the relationship y = a + bx + cx².
Therefore, e = Σ(yi − a − bxi − cxi²)² should be minimum,

i.e., ∂e/∂a = 2Σ(yi − a − bxi − cxi²)(−1) = 0,
∂e/∂b = 2Σ(yi − a − bxi − cxi²)(−xi) = 0,
and ∂e/∂c = 2Σ(yi − a − bxi − cxi²)(−xi²) = 0,

or Σyi = na + bΣxi + cΣxi²
Σxi yi = aΣxi + bΣxi² + cΣxi³
and Σxi² yi = aΣxi² + bΣxi³ + cΣxi⁴


These equations are known as normal equations whose solution will give
the values of a, b and c for the non-linear regression equation.
Example 8.3: Fit a second degree parabola to the following data.

x: 0 1.0 2 3 4
y: 1 4 10 17 30 .

Solution: Let y = a + bx + cx²


The normal equations will be
Σy = na + bΣx + cΣx²
Σxy = aΣx + bΣx² + cΣx³
Σx²y = aΣx² + bΣx³ + cΣx⁴

x y x2 x3 x4 xy x2 y
0 1 0 0 0 0 0
1 4 1 1 1 4 4
2 10 4 8 16 20 40
3 17 9 27 81 51 153
4 30 16 64 256 120 480



Σx = 10  Σy = 62  Σx² = 30  Σx³ = 100  Σx⁴ = 354  Σxy = 195  Σx²y = 677

So, 62 = 5a + 10b + 30c, 195 = 10a + 30b + 100c


and 677 = 30a + 100b + 354c
on solving these, we have a = 1.2, b = 1.1 and c = 1.5
so, y = 1.2 + 1.1x + 1.5x²
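The same fit can be reproduced by building and solving the three normal equations numerically; a self-contained sketch (the helper names are ours, and the small Gauss-Jordan elimination does no pivoting, so it is not robust for general systems):

```python
def solve3(A, v):
    # Gauss-Jordan elimination on a 3x3 system (no pivoting; a sketch only)
    M = [row[:] + [b] for row, b in zip(A, v)]
    for i in range(3):
        piv = M[i][i]
        M[i] = [m / piv for m in M[i]]
        for j in range(3):
            if j != i:
                f = M[j][i]
                M[j] = [mj - f * mi for mj, mi in zip(M[j], M[i])]
    return [M[k][3] for k in range(3)]

def fit_parabola(xs, ys):
    # Assemble the normal equations for y = a + b x + c x^2 from power sums
    n = len(xs)
    p = lambda k: sum(x ** k for x in xs)                     # sum of x^k
    q = lambda k: sum((x ** k) * y for x, y in zip(xs, ys))   # sum of x^k * y
    A = [[n,    p(1), p(2)],
         [p(1), p(2), p(3)],
         [p(2), p(3), p(4)]]
    return solve3(A, [q(0), q(1), q(2)])

# Data of Example 8.3; recovers a = 1.2, b = 1.1, c = 1.5
a, b, c = fit_parabola([0, 1, 2, 3, 4], [1, 4, 10, 17, 30])
```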
8.5.4 Multiple Linear Regression

Till now we have considered the case where the dependent variable is a function of
a single independent variable, i.e., of the type y = f(x); this is a case of simple
regression. Here we will consider the case where the dependent variable is a function
of two or more independent variables, i.e., of the type y = f(x1, x2, ..., xn). We call
this type of regression multiple regression.
Although there are many different formulae to express this type of regression,
the most widely used is the linear equation of the form
y = a + bx + cz
where a, b and c are regression coefficients which are to be determined. Applying
the method of least squares to determine these coefficients we minimize the sum of
square of the deviations.
That is,

e = Σ(yi − a − bxi − czi)²

Differentiating with respect to a, b and c, we have

∂e/∂a = 2Σ(yi − a − bxi − czi)(−1) = 0,
∂e/∂b = 2Σ(yi − a − bxi − czi)(−xi) = 0,
and ∂e/∂c = 2Σ(yi − a − bxi − czi)(−zi) = 0.
On simplification, we get

Σyi = na + bΣxi + cΣzi
Σxi yi = aΣxi + bΣxi² + cΣxi zi
and Σzi yi = aΣzi + bΣxi zi + cΣzi²
This is a system of three linear equations in three unknowns. On solving, we
will have the values of a, b and c and hence determine the regression
equation or plane.
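These three normal equations form a 3 × 3 linear system; a sketch of assembling and solving it for a small data set (the helper name is ours, and the Gauss-Jordan elimination does no pivoting):

```python
def fit_plane(xs, zs, ys):
    # Assemble and solve the normal equations for y = a + b x + c z
    n = len(xs)
    Sx, Sz, Sy = sum(xs), sum(zs), sum(ys)
    Sxx = sum(x * x for x in xs)
    Szz = sum(z * z for z in zs)
    Sxz = sum(x * z for x, z in zip(xs, zs))
    Sxy = sum(x * y for x, y in zip(xs, ys))
    Szy = sum(z * y for z, y in zip(zs, ys))
    M = [[n,  Sx,  Sz,  Sy],
         [Sx, Sxx, Sxz, Sxy],
         [Sz, Sxz, Szz, Szy]]
    for i in range(3):               # Gauss-Jordan elimination, no pivoting
        piv = M[i][i]
        M[i] = [m / piv for m in M[i]]
        for j in range(3):
            if j != i:
                f = M[j][i]
                M[j] = [mj - f * mi for mj, mi in zip(M[j], M[i])]
    return M[0][3], M[1][3], M[2][3]  # a, b, c

# Data generated from y = 1 + 2x + 3z, so the fit recovers those coefficients
a, b, c = fit_plane([1, 2, 3, 4], [0, 1, 0, 2], [3, 8, 7, 15])
```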

Example 8.4: The following marks have been obtained by a class of students in
maths (out of 100)
Paper I 45 55 56 58 60 65 68 70 75 80 85
Paper II 56 50 48 60 62 64 65 70 74 82 90

Find the equation of the lines of regression.


Solution: Taking Ax = 65 and Ay = 70 as assumed means, we make the following
table:
Paper I Paper II
x X X2 y Y Y2 XY

45 –20 400 56 –14 196 280


55 –10 100 50 –20 400 200
56 –9 81 48 –22 484 198
58 –7 49 60 –10 100 70
60 –5 25 62 –8 64 40
65 0 0 64 –6 36 0
68 3 9 65 –5 25 –15
70 5 25 70 0 0 0
75 10 100 74 4 16 40
80 15 225 82 12 144 180
85 20 400 90 20 400 400
ΣX = 2  ΣX² = 1414  ΣY = –49  ΣY² = 1865  ΣXY = 1393  (n = 11)
From the above values, we have

r = [1393 − (2)(−49)/11] / √[(1414 − 2²/11)(1865 − 49²/11)] = 0.918
The regression coefficient of X on Y,

bxy = r (σX/σY) = [ΣXY − (ΣX)(ΣY)/n] / [ΣY² − (ΣY)²/n] = 0.85

and the regression coefficient of Y on X,

byx = r (σY/σX) = [ΣXY − (ΣX)(ΣY)/n] / [ΣX² − (ΣX)²/n] = 0.99

Mean of the marks of Paper I:

x̄ = Ax + ΣX/n = 65 + 2/11 = 65.2 (approximately)

and mean of the marks of Paper II:

ȳ = Ay + ΣY/n = 70 − 49/11 = 65.55 (approximately)
Thus, the line of regression of x on y is

x − x̄ = r (σX/σY)(y − ȳ)

i.e., x − 65.2 = 0.85(y − 65.55), or x = 0.85y + 9.48,

and the regression line of y on x is

y − 65.55 = 0.99(x − 65.2), or y = 0.99x + 1.0
Example 8.5: Find the two lines of regression and coefficients of correlation for
the data given below:
n = 18, Σx = 12, Σy = 18,
Σx² = 60, Σy² = 96
and Σxy = 48.
Solution: x̄ = Σx/n = 12/18 = 0.67

ȳ = Σy/n = 18/18 = 1

σX² = Σx²/n − (Σx/n)² = 60/18 − (12/18)² = 2.88

σY² = Σy²/n − (Σy/n)² = 96/18 − (18/18)² = 4.33
So, σX = √2.88 = 1.7

and σY = √4.33 = 2.08

r = (n Σxy − Σx Σy) / √[{nΣx² − (Σx)²}{nΣy² − (Σy)²}] = 0.56
Regression equation of y on x:
y − ȳ = r (σY/σX)(x − x̄)

y − 1 = 0.56 × (2.08/1.7)(x − 0.67)

or y = 0.68x + 0.54,

and the regression equation of x on y is:

x − x̄ = r (σX/σY)(y − ȳ)

x − 0.67 = 0.56 × (1.7/2.08)(y − 1), or x = 0.46y + 0.21
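Example 8.5 works entirely from aggregate sums; the same computation in a few lines (a check sketch, with small differences from the text's figures due to rounding):

```python
import math

# Aggregate sums given in Example 8.5
n, Sx, Sy, Sxx, Syy, Sxy = 18, 12, 18, 60, 96, 48
xbar, ybar = Sx / n, Sy / n
sigma_x = math.sqrt(Sxx / n - xbar ** 2)   # about 1.70
sigma_y = math.sqrt(Syy / n - ybar ** 2)   # about 2.08
r = (n * Sxy - Sx * Sy) / math.sqrt((n * Sxx - Sx ** 2) * (n * Syy - Sy ** 2))
byx = r * sigma_y / sigma_x                # slope of the regression of y on x
```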
Example 8.6: Find a non-linear regression for the following data:

x: 1 2 3 4 5 6 7 8 9
y: 2 6 7 8 10 11 11 10 9

Solution: The equation of non-linear regression is given by

y = a + bx + cx²

and the normal equations are:

Σy = na + bΣx + cΣx²
Σxy = aΣx + bΣx² + cΣx³
and Σx²y = aΣx² + bΣx³ + cΣx⁴

From the given data we construct the following table:

n  x  y  x²  x³  x⁴  xy  x²y
1 1 2 1 1 1 2 2
2 2 6 4 8 16 12 24
3 3 7 9 27 81 21 63
4 4 8 16 64 256 32 128
5 5 10 25 125 625 50 250
6 6 11 36 216 1296 66 396
7 7 11 49 343 2401 77 539
8 8 10 64 512 4096 80 640
9 9 9 81 729 6561 81 729
n = 9  Σx = 45  Σy = 74  Σx² = 285  Σx³ = 2025  Σx⁴ = 15333  Σxy = 421  Σx²y = 2771
on substituting these values in the normal equations, we have
a = –0.923, b = 3.520 and c = –0.267
Hence, the required non-linear regression is:

y = −0.923 + 3.520x − 0.267x²


Example 8.7: Can Y = 5 + 2.8X and X = 3 − 0.5Y be the estimated regression
equations of Y on X and X on Y respectively?
Solution: Line of regression of Y on X is
Y = 5 + 2.8X ⇒ byx = 2.8
and line of regression of X on Y is
X = 3 – 0.5Y ⇒ bxy = –0.5
This is not possible since each of the regression coefficients byx and bxy must
have the same sign, which is same as that of Cov (X, Y). If Cov (X, Y) is positive,
both the regression coefficients are positive and if Cov (X, Y) is negative, then
both the regression coefficients are negative. Hence the given equations cannot be
estimated as regression equations of Y on X and X on Y respectively.
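The sign condition used in Example 8.7, together with the fact that byx · bxy = r² cannot exceed 1, can be packaged as a small validity check. The helper below is our own illustration, not from the text.

```python
def could_be_regression_slopes(byx: float, bxy: float) -> bool:
    """Return True if byx and bxy could be a pair of regression coefficients:
    they must share a sign (that of Cov(X, Y)), and their product, which
    equals r^2, must lie in (0, 1]."""
    product = byx * bxy
    return product > 0 and product <= 1

print(could_be_regression_slopes(2.8, -0.5))   # False: opposite signs (Example 8.7)
print(could_be_regression_slopes(0.8, 0.45))   # True: product 0.36 = r^2
```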
Example 8.8: In a partially destroyed laboratory record of an analysis of
correlation data, the following results only are legible:
Variance of X = 9.
Regression equations: 8X − 10Y + 66 = 0 and 40X − 18Y = 214.
Find (i) the mean values of X and Y,
(ii) the correlation coefficient between X and Y, and
(iii) the standard deviation of Y.



Solution: (i) Since both regression lines pass through the point (X̄, Ȳ), we have
8X̄ − 10Ȳ = −66 and 40X̄ − 18Ȳ = 214
On solving, we have X̄ = 13 and Ȳ = 17.
(ii) The regression lines can be rewritten as:
Y = (8/10)X + 66/10 and X = (18/40)Y + 214/40
∴ byx = 8/10 = 4/5 and bxy = 18/40 = 9/20
so, r² = byx · bxy = (4/5)(9/20) = 9/25
hence, r = ±3/5
Since both byx and bxy are positive, we take r = +3/5 = 0.6.
(iii) We have byx = r (σY/σX). Since the variance of X is 9, σX = 3, so
4/5 = (3/5)(σY/3) ⇒ σY = 4. Ans.
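The steps of Example 8.8 can be verified numerically; the sketch below (our own code) solves the two lines for the means and rebuilds r and σY.

```python
import math

# Both regression lines pass through (x_bar, y_bar):
# 8x - 10y = -66 and 40x - 18y = 214.
# Multiplying the first by 5 and subtracting gives 32y = 544.
y_bar = 544 / 32                 # 17.0
x_bar = (10 * y_bar - 66) / 8    # 13.0

b_yx, b_xy = 8 / 10, 18 / 40     # slopes read off the rewritten lines
r = math.sqrt(b_yx * b_xy)       # +0.6, since both slopes are positive

sigma_x = math.sqrt(9)           # variance of X is given as 9
sigma_y = b_yx * sigma_x / r     # from b_yx = r * sigma_y / sigma_x

print(x_bar, y_bar, round(r, 2), round(sigma_y, 2))   # 13.0 17.0 0.6 4.0
```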
Example 8.9: Obtain a regression plane by using multiple linear regression to fit
the data given below:

x: 1 2 3 4
y: 0 1 2 3
z: 12 18 24 30

Solution: Taking z as the dependent variable, we fit z = a + bx + cy. From the table below, the normal equations are:
84 = 4a + 10b + 6c
240 = 10a + 30b + 20c
156 = 6a + 20b + 14c
⇒ a = 10, b = 2 and c = 4.
So, the regression plane will be
z = 10 + 2x + 4y. Ans.

x        y       z        y²   x²   xz    yz    xy
1        0       12       0    1    12    0     0
2        1       18       1    4    36    18    2
3        2       24       4    9    72    48    6
4        3       30       9    16   120   90    12
Σx = 10  Σy = 6  Σz = 84  14   30   240   156   20
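As a check (our own code, not from the text), the fitted plane reproduces every observation of Example 8.9 exactly. One caveat worth noting: in this data set y = x − 1 at every point, so the three columns are collinear, the normal equations are singular, and the exact-fitting plane is not unique; (a, b, c) = (10, 2, 4) is one valid solution among many.

```python
import numpy as np

# Data of Example 8.9
x = np.array([1, 2, 3, 4], dtype=float)
y = np.array([0, 1, 2, 3], dtype=float)
z = np.array([12, 18, 24, 30], dtype=float)

# The plane from the worked solution reproduces the data exactly
a, b, c = 10.0, 2.0, 4.0
z_hat = a + b * x + c * y
print(np.allclose(z_hat, z))         # True

# Because y = x - 1 here, a different plane fits equally well:
print(np.allclose(6 + 6 * x, z))     # True
```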
8.5.5 Goodness of Fit

Having developed a regression equation, it is appropriate to understand how well this regression line fits the observed data.
(i) Coefficient of determination: The coefficient of determination, R², is a widely used measure of the goodness of fit of a regression line. It measures the extent or strength of the association that exists between the two variables, X and Y. Since this value is based on the sample data points, it is known as the 'Sample Coefficient of Determination'. Its value ranges from 0 (poor fit) to 1 (perfect fit). R² is based on two kinds of variation:
(i) variation of the Y values around the fitted regression line, and
(ii) variation of the Y values around their own mean.
The variation of the Y values around the regression line is given by Σ(Y − Ŷ)² and around their own mean by Σ(Y − Ȳ)², where Ŷ = a + bX is the regression line. Thus,

R² = explained variation / total variation

where total variation = explained variation + unexplained variation, i.e.,

Σ(Y − Ȳ)² = Σ(Ŷ − Ȳ)² + Σ(Y − Ŷ)²

R² is thus the proportion of the total variation that is 'explained'. Equivalently,

R² = (total variation − unexplained variation) / total variation
   = 1 − Σ(Y − Ŷ)² / Σ(Y − Ȳ)²

R² gives the proportion of variation in Y that can be accounted for by the variation in X; it can be used to measure how well X explains Y. R² → 1 shows close correlation between X and Y. A simplified computational formula for R² is

R² = (âΣY + b̂ΣXY − nȲ²) / (ΣY² − nȲ²)
For example:
1. If r² = 0.64, only 64% of the variation in the relative series has been explained by the subject series; the remaining variation is due to other factors. r² is non-negative and does not tell us the direction of the relationship between the two series.
2. When the sample size is small, the estimate of R² is positively biased, i.e., R² tends to be on the higher side. An unbiased estimate of R², known as the adjusted coefficient of determination, R̄², is given by

R̄² = 1 − residual variance / total variance
   = 1 − [Σ(Yi − Ŷi)² / (n − 2)] / [Σ(Yi − Ȳ)² / (n − 1)]
   = 1 − [Σ(Yi − Ŷi)² (n − 1)] / [Σ(Yi − Ȳ)² (n − 2)]

The adjustment factor being (n − 1)/(n − 2) > 1, R̄² will always be less than or equal to R². As the sample size increases, the ratio (n − 1)/(n − 2) → 1 and thus the difference between R² and R̄² will be reduced.
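The two formulas above translate directly into code. The helpers below are a sketch of our own for the one-predictor case (hence the n − 2 residual degrees of freedom); the toy numbers at the end are likewise our own.

```python
def r_squared(y, y_hat):
    """Coefficient of determination: 1 - unexplained/total variation."""
    y_mean = sum(y) / len(y)
    ss_res = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

def adjusted_r_squared(y, y_hat):
    """Adjusted R^2 for a single predictor: the (n - 1)/(n - 2) correction."""
    n = len(y)
    return 1 - (1 - r_squared(y, y_hat)) * (n - 1) / (n - 2)

# Toy illustration: the adjusted value never exceeds plain R^2
y = [1, 2, 3, 4]
y_hat = [1.1, 1.9, 3.2, 3.8]
print(round(r_squared(y, y_hat), 3), round(adjusted_r_squared(y, y_hat), 3))  # 0.98 0.97
```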
8.5.6 Estimate
The two constants or the parameters viz., ‘a’ and ‘b’ in the regression model for
the entire population or universe are generally unknown and as such are estimated
from sample information. The following are the two methods used for estimation:
(a) Scatter diagram method
(b) Least squares method
Scatter Diagram Method

This method makes use of the scatter diagram, also known as the dot diagram. A scatter diagram1 represents two series, with the known variable, i.e., the independent variable, plotted on the X-axis and the variable to be estimated, i.e., the dependent variable, plotted on the Y-axis of a graph paper (Refer Figure 8.3). Consider the following information:
Income X Consumption Expenditure Y
(Hundreds of Rupees) (Hundreds of Rupees)
41 44
65 60
50 39
57 51
96 80
94 68
110 84
30 34
79 55
65 48

1. Five possible forms which a scatter diagram may assume are depicted in five diagrams, (1) to (5). The first diagram is indicative of perfect positive relationship, the second shows perfect negative relationship, the third shows no relationship, the fourth shows positive relationship and the fifth shows negative relationship between the two variables under consideration.



The scatter diagram by itself is not sufficient for predicting values of the dependent variable. Some formal expression of the relationship between the two variables is necessary for predictive purposes. For this purpose, one may simply take a ruler and draw a straight line through the points in the scatter diagram, determine the intercept and the slope of the said line, and then define the line as Ŷ = a + bXi, with the help of which we can predict Y for a given value of X. But there are shortcomings in this approach. For example, if five different persons draw such a straight line in the same scatter diagram, there may well be five different estimates of a and b, especially when the dots are more dispersed in the diagram. Hence, the estimates cannot be worked out reliably through this approach. A more systematic and statistical method is required to estimate the constants of the predictive equation. The least squares method is used to draw the best fit line.
(Scatter diagram: Income ('00 Rs), 0 to 120, on the X-axis; Consumption Expenditure ('00 Rs) on the Y-axis.)
Figure 8.3 Scatter Diagram

Least Square Method


Least squares method of fitting a line (the line of best fit or the regression line)
through the scatter diagram is a method which minimizes the sum of the squared
vertical deviations from the fitted line. In other words, the line to be fitted will pass
through the points of the scatter diagram in such a way that the sum of the squares
of the vertical deviations of these points from the line will be a minimum.
The meaning of the least squares criterion can be easily understood through
reference to Figure 8.4 drawn below, where the earlier figure in scatter diagram has
been reproduced along with a line which represents the least squares line fit to the data.

Figure 8.4 Scatter Diagram, Regression Line and Short Vertical Lines Representing ‘e’
In the above figure, the vertical deviations of the individual points from the line are shown as the short vertical lines joining the points to the least squares line. These deviations will be denoted by the symbol 'e'. The value of 'e' varies from one point to another. In some cases it is positive, while in others it is negative. If the line drawn happens to be the least squares line, then the value of Σei² is the least possible. It is because of this feature that the method is known as the Least Squares Method.
Why we insist on minimizing the sum of squared deviations is a question that needs explanation. If we denote the deviation of the actual value Y from the estimated value Ŷ as (Y − Ŷ) or ei, it is logical that we want Σ(Y − Ŷ), i.e., Σei, to be as small as possible. However, merely examining Σei is inappropriate, since any ei can be positive or negative: large positive values and large negative values could cancel one another, yet large values of ei, regardless of their sign, indicate a poor prediction. Even if we ignore the signs and work with Σ|ei|, the difficulties continue. Hence, the standard procedure is to eliminate the effect of signs by squaring each deviation. Squaring each term accomplishes two purposes viz., (i) it magnifies (or penalizes) the larger errors, and (ii) it cancels the effect of the positive and negative values (since a negative error when squared becomes positive). Minimizing the sum of squared errors rather than the sum of the absolute values favours many small errors over a few large ones. Hence, in obtaining the regression line, we require that the sum of the squared deviations be minimum and on this basis work out the values of its constants viz., 'a' and 'b', also known as the intercept and the slope of the line. This is done with the help of the following two normal equations:2
ΣY = na + bΣX
ΣXY = aΣX + bΣX²
In the above two equations, 'a' and 'b' are unknowns; all other values viz., ΣX, ΣY, ΣX² and ΣXY are sums and sums of products to be calculated from the sample data, and 'n' is the number of observations in the sample.
The following examples explain the Least squares method.
2. If we proceed by centering each variable, i.e., setting its origin at its mean, then the two equations will be as under:
∑Y = na + b∑X
∑XY = a∑X + b∑X²
But since ∑Y and ∑X will then be zero, the first equation and the first term of the second equation disappear and we simply have:
∑XY = b∑X², i.e., b = ∑XY/∑X²
The value of 'a' can then be worked out as a = Ȳ − bX̄.
Example 8.10: Fit a regression line Ŷ = a + bXi by the method of least squares to the given sample information.


Observations:                          1   2   3   4   5   6   7   8   9  10
Income (X) ('00 Rs):                  41  65  50  57  96  94 110  30  79  65
Consumption Expenditure (Y) ('00 Rs): 44  60  39  51  80  68  84  34  55  48
Solution: We are to fit a regression line Ŷ = a + bXi to the given data by the method of least squares. Accordingly, we work out the 'a' and 'b' values with the help of the normal equations stated above, computing ∑X, ∑Y, ∑XY and ∑X² from the given sample information, as shown in the following table.
Summations for Regression Equation

Observations   Income X ('00 Rs)   Consumption Expenditure Y ('00 Rs)   XY   X²   Y²
1 41 44 1804 1681 1936
2 65 60 3900 4225 3600
3 50 39 1950 2500 1521
4 57 51 2907 3249 2601
5 96 80 7680 9216 6400
6 94 68 6392 8836 4624
7 110 84 9240 12100 7056
8 30 34 1020 900 1156
9 79 55 4345 6241 3025
10 65 48 3120 4225 2304
n = 10 ∑X = 687 ∑Y =563 ∑XY = 42358 ∑X2= 53173 ∑Y2 = 34223

Putting the values in the required normal equations we have,


563 = 10a + 687b
42358 = 687a + 53173b
Solving these two equations for a and b we obtain,
a = 14.000 and b = 0.616
Hence, the equation for the required regression line is
Ŷ = a + bXi
or, Ŷ = 14.000 + 0.616Xi
This equation is known as the regression equation of Y on X from which Y
values can be estimated for given values of X variable.3
3. It should be pointed out that the equation used to estimate values of the Y variable from values of X should not be used to estimate values of the X variable from given values of Y. Another regression equation (known as the regression equation of X on Y, of the type X = a + bY), which reverses the roles of the two variables, should be used if it is desired to estimate X from values of Y.
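The two normal equations of Example 8.10 have a closed-form solution in terms of the column totals, which the sketch below (our own code) uses to reproduce a ≈ 14.0 and b ≈ 0.616.

```python
# Column totals from the summations table of Example 8.10
n = 10
Sx, Sy, Sxy, Sxx = 687, 563, 42358, 53173

# Closed-form solution of the two normal equations
b = (n * Sxy - Sx * Sy) / (n * Sxx - Sx ** 2)
a = (Sy - b * Sx) / n

print(round(a, 2), round(b, 3))   # 14.0 0.616
print(round(a + b * 100, 1))      # point estimate at X = 100: 75.6
```

The second print is the prediction discussed below: an income of Rs 10,000 (X = 100 in hundreds of rupees) gives an estimated consumption expenditure of about Rs 7,560.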
Checking the Accuracy of Equation

After finding the regression line as stated above, one can check its accuracy also. The method to be used for the purpose follows from a mathematical property of a line fitted by the method of least squares viz., the individual positive and negative errors must sum to zero. In other words, using the estimating equation one must find out whether the term Σ(Y − Ŷ) is zero; if this is so, then one can reasonably be sure that no mistake has been committed in determining the estimating equation.
The Problem of Prediction

When we talk about prediction or estimation, we usually imply that if the relationship Yi = a + bXi + ei exists, then the regression equation Ŷ = a + bXi provides a base for making estimates of the value of Y which will be associated with particular values of X. In Example 8.10, we worked out the regression equation for the income and consumption data as

Ŷ = 14.000 + 0.616Xi

On the basis of this equation we can make a point estimate of Y for any given value of X. Suppose we wish to estimate the consumption expenditure of individuals with income of Rs 10,000. Since income is measured in hundreds of rupees, we substitute X = 100 in our equation and get an estimate of consumption expenditure as follows:

Ŷ = 14.000 + 0.616(100) = 75.60

Thus, the regression relationship indicates that individuals with Rs 10,000


of income may be expected to spend approximately Rs 7,560 on consumption.
But this is only an expected or estimated value, and it is possible that the actual consumption expenditure of an individual with that income may deviate from this amount; if so, our estimate will be in error, the likelihood of which will be high if the estimate is applied to any one individual. The interval estimate method is considered better: it states an interval in which the expected consumption expenditure may fall. Remember that the wider the interval, the greater the level of confidence we can have; the width of the interval (or what is technically known as the precision of the estimate) is associated with a specified level of confidence and depends on the variability (of consumption expenditure, in our case) found in the sample. This variability is measured by the standard deviation of the error term, 'e', and is popularly known as the standard error of the estimate.
Standard Error of Estimate: This is another way to measure the goodness of
regression line. This shows how precise the prediction of Y is based on the regression
equation of Y on X.



SYX = σ̂ = √[Σ(Y − Ŷ)² / (n − 2)]

The denominator is (n − 2) as two sample estimates (for a and b) have been made. This measure deals with the deviations of the data points about the regression line, while the standard deviation of Y deals with the deviations of the data points about their own mean.
The smaller the value of the standard error of estimate, the better is the fit of the equation to the sample data and the better are the estimates based on the regression equation.
If SYX = 0, there is a perfect match between the observed and the estimated values (it is the case of perfect correlation). A convenient computational form is

σ̂ = √[(ΣY² − âΣY − b̂ΣXY) / (n − 2)]
For example, in a study of how wheat yield depends on fertilizer input, ten experimental plots were selected for study. The yields (Y) at different levels of fertilizer input (X) were observed as given in the following table.
Plot no.   Fertilizer input (X)   X²             Yield (Y)   XY             Ŷ
           (kg/acre)                             (MT/acre)
1          15                     225            15.0        225            17.295
2          25                     625            20.0        500            19.995
3          30                     900            22.5        675            21.345
4          40                     1600           25.0        1000           24.045
5          50                     2500           28.0        1400           26.745
6          60                     3600           30.0        1800           29.445
7          75                     5625           32.0        2400           33.495
8          80                     6400           35.0        2800           34.845
9          90                     8100           37.5        3375           37.545
10         100                    10,000         40.0        4000           40.245
n = 10     ΣXi = 565              ΣXi² = 39,575  ΣYi = 285   ΣXiYi = 18175  ΣŶi = 285

Fit a least squares regression line for the sample data.


X̄ = ΣXi/n = 565/10 = 56.50
Ȳ = ΣYi/n = 285.0/10 = 28.50

b̂ = [nΣXiYi − ΣXiΣYi] / [nΣXi² − (ΣXi)²]
  = [(10)(18175) − (565)(285)] / [(10)(39575) − (565)²] = 0.27

â = ΣYi/n − b̂ ΣXi/n = 13.245

Therefore, the estimated line of regression is
Ŷi = 13.245 + 0.27Xi
The coefficient of determination R² is given by
R² = 1 − Σ(Y − Ŷ)² / Σ(Y − Ȳ)² = 1 − 11.717/573.0 = 0.98 ≈ 1
The standard error of estimate is given by
σ̂ = √[Σ(Y − Ŷ)² / (n − 2)] = √(11.717/8) = 1.21
We can also plot the raw data Y and X and the estimated line on a graph.
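The wheat-yield computation can be reproduced end to end. The sketch below (our own code) carries full precision, so the slope comes out as 0.2708 rather than the rounded 0.27 used in the hand computation (and the intercept as about 13.20 rather than 13.245), while R² and the standard error of estimate agree with the worked values.

```python
import math

X = [15, 25, 30, 40, 50, 60, 75, 80, 90, 100]     # fertilizer input (kg/acre)
Y = [15, 20, 22.5, 25, 28, 30, 32, 35, 37.5, 40]  # yield (MT/acre)
n = len(X)

Sx, Sy = sum(X), sum(Y)
Sxx = sum(x * x for x in X)
Sxy = sum(x * y for x, y in zip(X, Y))

b = (n * Sxy - Sx * Sy) / (n * Sxx - Sx ** 2)   # slope
a = Sy / n - b * Sx / n                         # intercept
Y_hat = [a + b * x for x in X]

ss_res = sum((y - yh) ** 2 for y, yh in zip(Y, Y_hat))
ss_tot = sum((y - Sy / n) ** 2 for y in Y)
r2 = 1 - ss_res / ss_tot                        # coefficient of determination
se = math.sqrt(ss_res / (n - 2))                # standard error of estimate

print(round(b, 4), round(a, 2), round(r2, 2), round(se, 2))
```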

CHECK YOUR PROGRESS


4. What is the need for studying regression in statistics?
5. What do you understand by Coefficient of determination?
6. What is a multiple regression?
7. What do you understand by the line of linear regression?

8.6 SUMMARY

In this unit, you have learned that:


• In statistics, population does not mean human population alone. A complete
set of objects under study, living or non-living is called population or universe;
for example, graduates in Pondy, Bajaj tube lights, Ceat car tyres etc.
• Each individual or object is called a unit, member or element of that population. If there are a finite number of elements, then it is called a finite population. If the number of elements is infinite, then it is called an infinite population.
• A very important aspect of sampling theory is the study of tests of significance, which give us a ground for deciding whether the deviation between the observed sample statistic and the hypothetical parameter value, or the deviation between two independent sample statistics, is significant.
two independent sample statistics.
• A statistical hypothesis is a statement about a population parameter. There are two types of statistical hypotheses: the null and the alternative hypothesis.
• The level of confidence expresses the degree of credibility that may be attached to the result. The probability that we associate with the interval estimate of a population parameter, indicating how confident we are that the interval estimate will include the true population parameter, is called the confidence level. A higher probability means more confidence.
• Level of significance is defined as the probability of rejecting the null hypothesis when it is true. For example, a 5% level of significance means that there is about a 5% chance, i.e., out of 100 tests, you may reject the null hypothesis 5 times when it is true. The probability that a random value of the test statistic t belongs to the critical region is known as the level of significance.
• Regression analysis refers to the techniques used for modeling and analyzing
several variables, when the focus is on the relationship between a dependent
variable and one or more independent variables.
• If the variables in a bi-variate distribution are related, we will find that the
points in the scatter diagram will cluster round some curve called the ‘curve
of regression’.
• If the curve is a straight line it is called line of regression and there is said to
be linear regression between the variables, otherwise regression is said to
be curvilinear.
• Although there are many different formulae to express this type of regression,
the most widely used is the linear equation of the form
y = a + bx + cz
where a, b and c are regression coefficients which are to be determined.
Applying the method of least squares to determine these coefficients we
minimize the sum of square of the deviations.

8.7 KEY TERMS

• Population (in statistics): It is a complete set of objects under study,


living or non-living.
• Parameter: It is any constant calculated for the whole population.
• Testing of hypothesis: It is the procedure of testing the validity of our hypothesis using sample statistics.
• Null hypothesis: It is a hypothesis of no difference from the present state
and is tested to verify its validity.
NOTES
• Level of significance: It is the probability of rejecting the null hypothesis when it is true.
• Critical region: A region, corresponding to a statistic t, in the sample space S which amounts to rejection of H0 is known as the critical region or region of rejection.
• Regression analysis: It includes any technique for modeling and analyzing
several variables, when the focus is on the relationship between a dependent
variable and one or more independent variables.
• Curve of regression: If the variables in a bi-variate distribution are related,
we will find that the points in the scatter diagram will cluster round some
curve called the ‘curve of regression’.
• Multiple linear regression: It is the case where the dependent variable is a function of two or more independent variables, i.e., of the type y = f(x1, x2, ..., xn). This type of regression is called multiple regression.
• Coefficient of determination: R² is the proportion of the total variation in the dependent variable that is 'explained' by the regression.

8.8 ANSWERS TO ‘CHECK YOUR PROGRESS’

1. To test the validity of the hypothesis we would make use of sample


observations and statistics.
2. A very important aspect of sampling theory is the study of tests of significance, which give us a ground for deciding whether the deviation between the observed sample statistic and the hypothetical parameter value, or the deviation between two independent sample statistics, is significant.
3. A two-tailed test rejects the null hypothesis if the sample mean is either
more or less than the hypothesized value of the mean of the population.
4. The following are some of the advantages of studying regression in statistics:
• Regression analysis includes any techniques for modeling and analyzing
several variables, when the focus is on the relationship between a
dependent variable and one or more independent variables.
• Regression analysis helps us understand how the typical value of the
dependent variable changes when any one of the independent variables
is varied, while the other independent variables are held fixed.
• Regression analysis estimates the conditional expectation of the dependent variable given the independent variables, that is, the average value of the dependent variable when the independent variables are held fixed.
5. The coefficient of determination, R2, is a widely used measure of the goodness
of fit of a regression line. It measures the extent or strength of the association
that exists between the two variables, X and Y. Since this value is based on NOTES
the sample data points, it is known as the ‘Sample Coefficient of
determination'. Its value ranges from 0 (poor fit) to 1 (perfect fit).
6. In case the dependent variable is a function of two or more independent variables, i.e., of the type y = f(x1, x2, ..., xn), the regression is called multiple regression.
7. The line of regression is the line which gives the best estimate to the value of
one variable for any specific value of the other variable. Thus, the line of
regression is the line of ‘best fit’ and is obtained by the principles of least
square.

8.9 QUESTIONS AND EXERCISES

Short-Answer Questions
1. What is meant by testing of hypothesis?
2. What is a null hypothesis?
3. What are type I and type II errors?
4. What are one sided test and two sided test?
5. When a sample is called small?
6. What do you understand by regression analysis?
7. What is non-linear regression?
8. Explain briefly coefficients of determination.
9. What do you understand by Standard Error of Estimate?
Long-Answer Questions
1. A company claims that 5% of its products are defective. In a sample of
400 items 320 are good. Test whether the claim is valid.
2. From rural schools 700 children were randomly chosen and examined.
200 of them had eye defects. From urban schools out of 900 children 180
had similar defects. Examine whether more percentage of rural school
children have eye defects.
3. A group of children was tested for measuring mathematical ability. They
were given tuition for a month’s time and second test of equal ability was
given.



No               1   2   3   4   5   6   7   8   9  10
Marks in Test 1  25  40  18  25  29  35  34  41  53  27
Marks in Test 2  40  38  20  25  35  32  36  45  50  24
Do the marks give evidence that the students have been benefitted by the
tuition?
4. The following table gives the increase in weights due to diet A and diet B.
Diet A : 10 12 9 15 7 16 15
Diet B : 8 15 8 11 9 10
Test whether there is significant difference between the mean increase in
weights of the diets A and B.
Test whether the variances in increase in weights differ significantly.
5. What is line of regression? Explain briefly.
6. What are regression coefficients? Discuss.
7. What do you understand by multiple linear regression? Explain with an
example.
8. What is co-efficient of determination? Explain.
9. Explain scatter diagram method for estimating regression equation.
10. What is least square method? Explain.

8.10 FURTHER READING

Sivayya, K.V. and K. Satya Rao. Business Mathematics. Guntur: Technical


Publishers.
Sancheti, D.C. and V.K. Kapoor. Business Mathematics. New Delhi: Sultan
Chand & Sons.
Levin, Richard I. 1991. Statistics for Management. New Jersey: Prentice-Hall.
Chance, William A. 1969. Statistical Methods for Decision Making. Illinois:
Richard D Irwin.
Yule, G.U., and M.G. Kendall. 1950. An Introduction to the Theory of Statistics.
London: Griffin.
Trivedi, K.S. 1994. Probability and Statistics with Reliability. New Delhi: PHI.
Levin, Richard. I., and David. S. Rubin. 1997. Statistics for Management, 7th
edition, New Jersey: Prentice Hall International.
Gupta, S.C., and V.K. Kapoor. 1997. Fundamentals of Mathematical Statistics.
9th Revised Edition, New Delhi: Sultan Chand & Sons.

278 Self-Instructional Material


Freud, J.E., and F.J. William. 1997. Elementary Business Statistics – The Modern Approach. 3rd edition, New Jersey: Prentice Hall International.
Modern Approach. 3rd edition, New Jersey: Prentice Hall International.


Goon, A.M., M.K. Gupta, and B. Das Gupta. 1983. Fundamentals of Statistics.
Vols. I & II, Calcutta: The World Press Pvt. Ltd.