
INTRODUCTION TO MACHINE LEARNING - 20EC6404C

Topic 3.1: Feed-forward Network Functions


Unit - III
Faculty: Turimerla Pratap

Feed-forward Network Functions


The linear models for regression and classification are based on linear combinations of fixed nonlinear basis functions ϕ_j(x). The goal is to extend this model by making the basis functions ϕ_j(x) depend on parameters and allowing these parameters to be adjusted during training. The architectures formed in this way are called feed-forward networks, and the functions they implement are called feed-forward network functions.

The feed-forward network consists of the following stages:

Linear combinations of inputs: We construct M linear combinations of the input variables x_1, \ldots, x_D using weights and biases:

a_j = \sum_{i=1}^{D} w_{ji}^{(1)} x_i + w_{j0}^{(1)}, \qquad \text{for } j = 1, \ldots, M.
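As a concrete illustration, all M pre-activations can be computed at once as a single matrix-vector product. The following is a minimal NumPy sketch; the dimensions and the names x, W1, b1 are assumptions made for the example, not notation from these notes.

    import numpy as np

    D, M = 4, 3                    # assumed input and hidden-layer sizes
    rng = np.random.default_rng(0)

    x  = rng.normal(size=D)        # input vector (x_1, ..., x_D)
    W1 = rng.normal(size=(M, D))   # first-layer weights w_ji^(1)
    b1 = rng.normal(size=M)        # first-layer biases  w_j0^(1)

    # a_j = sum_i w_ji^(1) x_i + w_j0^(1), computed for all j at once
    a = W1 @ x + b1                # shape (M,)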

Activation of hidden units: The linear combinations a_j are then transformed using a differentiable, nonlinear activation function h(·) to obtain the hidden unit activations:

z_j = h(a_j), \qquad \text{for } j = 1, \ldots, M.
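Continuing the sketch above, the activation is applied elementwise; tanh is used here only as one common differentiable choice of h(·), not one prescribed by the notes.

    # z_j = h(a_j); tanh is one standard differentiable choice for h
    z = np.tanh(a)                 # shape (M,)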

Output unit activations: The hidden unit activations z_j are linearly combined using weights and biases to obtain the output unit activations:

a_k = \sum_{j=1}^{M} w_{kj}^{(2)} z_j + w_{k0}^{(2)}, \qquad \text{for } k = 1, \ldots, K.
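The second layer repeats the same linear construction on the hidden activations. Continuing the sketch, with assumed names W2, b2 and an assumed output size K:

    K  = 2                         # assumed number of output units
    W2 = rng.normal(size=(K, M))   # second-layer weights w_kj^(2)
    b2 = rng.normal(size=K)        # second-layer biases  w_k0^(2)

    # a_k = sum_j w_kj^(2) z_j + w_k0^(2)
    a_out = W2 @ z + b2            # shape (K,)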

Final network outputs: The output unit activations a_k are transformed using an appropriate activation function to obtain the network outputs y_k:

y_k = f(a_k), \qquad \text{where } f(·) \text{ is the chosen activation function.}

The choice of activation function depends on the data and the assumed distribution of the target
variable. These equations represent the basic structure of a neural network, where the hidden units
and output units are formed through linear combinations of inputs and nonlinear transformations.
The weights and biases are adjustable parameters that are learned during training to optimize the
network’s performance in regression or classification tasks.

For standard regression problems, the activation function is the identity, so that y_k = a_k. For multiple binary classification problems, each output unit activation is instead transformed using a logistic sigmoid function, so that

y_k = \sigma(a_k),

where

\sigma(a) = \frac{1}{1 + \exp(-a)}.
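A direct translation of this definition into the running NumPy sketch (adequate for moderate inputs; a production implementation would guard against overflow for large negative a, e.g. via scipy.special.expit):

    def sigmoid(a):
        """Logistic sigmoid: sigma(a) = 1 / (1 + exp(-a))."""
        return 1.0 / (1.0 + np.exp(-a))

    y = sigmoid(a_out)             # K outputs, each in (0, 1)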
We can combine these various stages to give the overall network function that, for sigmoidal
output unit activation functions, takes the form
y_k(\mathbf{x}, \mathbf{w}) = \sigma\left( \sum_{j=1}^{M} w_{kj}^{(2)} \, h\left( \sum_{i=1}^{D} w_{ji}^{(1)} x_i + w_{j0}^{(1)} \right) + w_{k0}^{(2)} \right)

where the set of all weight and bias parameters has been grouped together into a vector w. Thus, the neural network model is simply a nonlinear function from a set of input variables x_i to a set of output variables y_k, controlled by a vector w of adjustable parameters.
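Putting the stages together, the whole two-layer mapping can be written as one small function. This is a minimal, self-contained sketch of the forward pass for sigmoidal outputs; the parameter shapes, names, and the tanh hidden nonlinearity are illustrative assumptions.

    import numpy as np

    def forward(x, W1, b1, W2, b2, h=np.tanh):
        """Two-layer feed-forward network with sigmoidal outputs:
        y_k = sigma( sum_j w_kj^(2) h( sum_i w_ji^(1) x_i + w_j0^(1) ) + w_k0^(2) )."""
        a = W1 @ x + b1                      # first-layer pre-activations a_j
        z = h(a)                             # hidden activations z_j = h(a_j)
        a_out = W2 @ z + b2                  # output pre-activations a_k
        return 1.0 / (1.0 + np.exp(-a_out))  # y_k = sigma(a_k)

    # Example with assumed sizes D = 4, M = 3, K = 2
    rng = np.random.default_rng(0)
    y = forward(rng.normal(size=4),
                rng.normal(size=(3, 4)), rng.normal(size=3),
                rng.normal(size=(2, 3)), rng.normal(size=2))
    print(y)                                 # two values in (0, 1)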
