Chapter 3: Cluster Analysis
3.1 Basic Concepts of Clustering
3.1.1 Cluster Analysis
3.1.2 Clustering Categories
3.2 Partitioning Methods
3.2.1 The principle
3.2.2 K-Means Method
3.2.3 K-Medoids Method
3.2.4 CLARA
3.2.5 CLARANS
3.3 Hierarchical Methods
3.4 Density-based Methods
3.5 Clustering High-Dimensional Data
3.6 Outlier Analysis
3.1.1 Cluster Analysis
Unsupervised learning (i.e., class labels are unknown)
Group data to form new categories (i.e., clusters), e.g., cluster houses to find distribution patterns
Principle: maximize intra-class similarity and minimize inter-class similarity
Typical Applications
WWW, social networks, marketing, biology, libraries, etc.
3.1.2 Clustering Categories
Partitioning Methods: construct k partitions of the data
Hierarchical Methods: create a hierarchical decomposition of the data
Density-based Methods: grow a given cluster depending on its density (# data objects)
Grid-based Methods: quantize the object space into a finite number of cells
Model-based Methods: hypothesize a model for each cluster and find the best fit of the data to the given model
Clustering high-dimensional data: subspace clustering
Constraint-based Methods: used for user-specific applications
3.2.1 Partitioning Methods: The Principle
Given
a data set of n objects
k, the number of clusters to form
Organize the objects into k partitions (k <= n), where each partition represents a cluster
The clusters are formed to optimize an objective partitioning criterion
Objects within a cluster are similar
Objects of different clusters are dissimilar
3.2.2 K-Means Method
Goal: create 3 clusters (partitions)
Choose 3 objects as the initial cluster centroids
Assign each object to the closest centroid to form clusters
Update the cluster centroids
Recompute the clusters
If the centroids are stable, stop; otherwise repeat
K-Means Algorithm
Input
K: the number of clusters
D: a data set containing n objects
Output: A set of k clusters
Method:
(1) Arbitrarily choose k objects from D as the initial cluster centers
(2) Repeat
(3) Reassign each object to the most similar cluster based on the
mean value of the objects in the cluster
(4) Update the cluster means
(5) Until no change
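The steps above can be made concrete with a minimal Python/NumPy sketch; the function name kmeans, the empty-cluster handling, and the convergence test are illustrative choices rather than part of the slides.

```python
import numpy as np

def kmeans(D, k, max_iter=100, seed=0):
    """Minimal k-means sketch; D is an (n, d) array of numeric objects."""
    rng = np.random.default_rng(seed)
    # (1) Arbitrarily choose k objects from D as the initial cluster centers
    centers = D[rng.choice(len(D), size=k, replace=False)]
    for _ in range(max_iter):
        # (3) Reassign each object to the cluster with the most similar (nearest) mean
        labels = np.linalg.norm(D[:, None, :] - centers[None, :, :], axis=2).argmin(axis=1)
        # (4) Update the cluster means (keep the old center if a cluster became empty)
        new_centers = np.array([D[labels == i].mean(axis=0) if np.any(labels == i)
                                else centers[i] for i in range(k)])
        # (5) Stop when the means no longer change
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    labels = np.linalg.norm(D[:, None, :] - centers[None, :, :], axis=2).argmin(axis=1)
    E = ((D - centers[labels]) ** 2).sum()   # square-error criterion (next slide)
    return labels, centers, E
```

The value E returned at the end is the square-error criterion discussed on the next slide.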
K-Means Properties
The algorithm attempts to determine k partitions that minimize the square-error function
E = \sum_{i=1}^{k} \sum_{p \in C_i} |p - m_i|^2
E: the sum of the squared error for all objects in the data set
p: the point in space representing a given object
m_i: the mean of cluster C_i
It works well when the clusters are compact clouds that are rather well separated from one another
K-Means Properties
Advantages
K-means is relatively scalable and efficient in processing large
data sets
The computational complexity of the algorithm is O(nkt)
n: the total number of objects
k: the number of clusters
t: the number of iterations
Normally: k<<n and t<<n
Disadvantages
Can be applied only when the mean of a cluster is defined
Users need to specify k
K-means is not suitable for discovering clusters with non-convex shapes or clusters of very different sizes
It is sensitive to noise and outlier data points, since they can heavily influence the mean value
Variations of the K-Means Method
A few variants of the k-means method differ in
Selection of the initial k means
Dissimilarity calculations
Strategies to calculate cluster means
Handling categorical data: k-modes (Huang98)
Replacing means of clusters with modes
Using new dissimilarity measures to deal with categorical objects
Using a frequency-based method to update modes of clusters
A mixture of categorical and numerical data
3.2.3 K-Medoids Method
Minimize the sensitivity of k-means to outliers
Pick actual objects to represent clusters instead of mean values
Each remaining object is clustered with the representative object (medoid) to which it is the most similar
The algorithm minimizes the sum of the dissimilarities between each object and its corresponding reference point
E = \sum_{i=1}^{k} \sum_{p \in C_i} |p - o_i|
E: the sum of absolute error for all objects in the data set
p: the point in space representing a given object
o_i: the representative object (medoid) of cluster C_i
K-Medoids Method: The Idea
Initial representatives are chosen randomly
The iterative process of replacing representative objects by non-representative objects continues as long as the quality of the clustering is improved
For each representative object O
For each non-representative object R, swap O and R
Choose the configuration with the lowest cost
The cost function is the difference in absolute error if a current representative object is replaced by a non-representative object
K-Medoids Method: Example
Data objects (attributes A1 and A2):
O1 = (2, 6), O2 = (3, 4), O3 = (3, 8), O4 = (4, 7), O5 = (6, 2),
O6 = (6, 4), O7 = (7, 3), O8 = (7, 4), O9 = (8, 5), O10 = (7, 6)
Goal: create two clusters
Choose randomly two medoids
O2 = (3, 4)
O8 = (7, 4)
K-Medoids Method: Example
Assign each object to the closest representative object
Using L1 Metric (Manhattan), we form the following
clusters
Cluster1 = {O1, O2, O3, O4}
Cluster2 = {O5, O6, O7, O8, O9, O10}
K-Medoids Method: Example
Compute the absolute error criterion for the set of medoids (O2, O8):
E = \sum_{i=1}^{k} \sum_{p \in C_i} |p - o_i|
= (|O1 - O2| + |O3 - O2| + |O4 - O2|) + (|O5 - O8| + |O6 - O8| + |O7 - O8| + |O9 - O8| + |O10 - O8|)
K-Medoids Method: Example
The absolute error criterion for the set of medoids (O2, O8):
E = (3 + 4 + 4) + (3 + 1 + 1 + 2 + 2) = 20
K-Medoids Method: Example
Choose a random object O7
Swap O8 and O7
Compute the absolute error criterion for the set of medoids (O2, O7):
E = (3 + 4 + 4) + (2 + 2 + 1 + 3 + 3) = 22
K-Medoids Method: Example
Compute the cost function
S = Absolute error [for O2, O7] - Absolute error [for O2, O8] = 22 - 20 = 2
Since S > 0, it is a bad idea to replace O8 by O7
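The two error values above can be checked directly in Python, assuming the coordinates listed in the data table at the start of the example; the helper names manhattan and absolute_error are illustrative.

```python
# Quick check of the absolute-error values (L1 / Manhattan metric).
objects = {
    "O1": (2, 6), "O2": (3, 4), "O3": (3, 8), "O4": (4, 7), "O5": (6, 2),
    "O6": (6, 4), "O7": (7, 3), "O8": (7, 4), "O9": (8, 5), "O10": (7, 6),
}

def manhattan(p, q):
    return sum(abs(a - b) for a, b in zip(p, q))

def absolute_error(medoids):
    # Each object contributes its distance to the nearest medoid.
    return sum(min(manhattan(p, objects[m]) for m in medoids)
               for p in objects.values())

print(absolute_error(["O2", "O8"]))  # 20
print(absolute_error(["O2", "O7"]))  # 22
```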
K-Medoids Method
In this example, changing the medoid of
cluster 2 did not change the assignments of
objects to clusters.
What are the possible cases when we
replace a medoid by another object?
K-Medoids Method
First case (A and B are the current representative objects; B is the one swapped with a random object)
Currently P is assigned to A; the assignment of P to A does not change
Second case
Currently P is assigned to B; P is reassigned to A
K-Medoids Method
Third case
Currently P is assigned to B; P is reassigned to the new B (the random object)
Fourth case
Currently P is assigned to A; P is reassigned to B (the new B, i.e., the random object)
K-Medoids Algorithm (PAM)
PAM : Partitioning Around Medoids
Input
K: the number of clusters
D: a data set containing n objects
Output: A set of k clusters
Method:
(1) Arbitrarily choose k objects from D as representative objects (seeds)
(2) Repeat
(3) Assign each remaining object to the cluster with the nearest
representative object
(4) For each representative object Oj
(5) Randomly select a non-representative object Orandom
(6) Compute the total cost S of swapping representative object Oj with
Orandom
(7) if S<0 then replace Oj with Orandom
(8) Until no change
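A compact Python sketch of PAM is shown below, reusing NumPy and the Manhattan metric from the example; for simplicity it evaluates every possible swap in each pass instead of a single randomly chosen Orandom, and the names pam and cost are illustrative.

```python
import numpy as np
from itertools import product

def pam(D, k, seed=0):
    """Minimal PAM sketch; D is an (n, d) array, distances are Manhattan (L1)."""
    rng = np.random.default_rng(seed)
    dist = np.abs(D[:, None, :] - D[None, :, :]).sum(axis=2)  # pairwise L1 distances
    # (1) Arbitrarily choose k objects as representative objects (seeds)
    medoids = list(rng.choice(len(D), size=k, replace=False))

    def cost(meds):
        # Absolute-error criterion: each object contributes its distance
        # to the nearest representative object.
        return dist[:, meds].min(axis=1).sum()

    improved = True
    while improved:                                   # (2)-(8) repeat until no change
        improved = False
        for j, o in product(range(k), range(len(D))): # try swapping each medoid
            if o in medoids:
                continue
            candidate = medoids.copy()
            candidate[j] = o
            if cost(candidate) < cost(medoids):       # keep the swap if S < 0
                medoids, improved = candidate, True
    labels = dist[:, medoids].argmin(axis=1)          # (3) final assignment
    return medoids, labels
```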
K-Medoids Properties (k-medoids vs. k-means)
The complexity of each iteration is O(k(n-k)^2)
For large values of n and k, such computation becomes very costly
Advantages
K-Medoids method is more robust than k-Means in the presence of
noise and outliers
Disadvantages
K-Medoids is more costly than the k-Means method
Like k-means, k-medoids requires the user to specify k
It does not scale well for large data sets
3.2.4 CLARA
CLARA (Clustering Large Applications) uses a sampling-based
method to deal with large data sets
A random sample should closely represent the original data
The chosen medoids will likely be similar to what would have been chosen from the whole data set
[Figure: PAM is applied to a sample drawn from the data set]
CLARA
Draw multiple samples of the data set
Apply PAM to each sample
Choose the best clustering
[Figure: each sample sample1, sample2, ..., samplem is clustered with PAM, and the best of the resulting clusterings is returned (see the sketch below)]
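A rough sketch of CLARA, built on the pam() sketch above, could look as follows; the default of five samples of size 40 + 2k follows the values commonly cited for CLARA, and the function name clara is illustrative.

```python
import numpy as np

def clara(D, k, num_samples=5, sample_size=None, seed=0):
    """Rough CLARA sketch: run PAM on several random samples and keep the
    medoids with the lowest absolute error over the full data set."""
    rng = np.random.default_rng(seed)
    n = len(D)
    sample_size = sample_size or min(n, 40 + 2 * k)   # commonly cited CLARA sample size
    best_cost, best_medoids = np.inf, None
    for _ in range(num_samples):
        idx = rng.choice(n, size=sample_size, replace=False)
        med_idx, _ = pam(D[idx], k, seed=int(rng.integers(1 << 30)))
        medoids = D[idx][med_idx]
        # Evaluate the sample's medoids against all n objects (Manhattan distance)
        cost = np.abs(D[:, None, :] - medoids[None, :, :]).sum(axis=2).min(axis=1).sum()
        if cost < best_cost:
            best_cost, best_medoids = cost, medoids
    labels = np.abs(D[:, None, :] - best_medoids[None, :, :]).sum(axis=2).argmin(axis=1)
    return best_medoids, labels, best_cost
```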
CLARA Properties
The complexity of each iteration is O(ks^2 + k(n-k))
s: the size of the sample
k: number of clusters
n: number of objects
PAM finds the best k medoids among a given data set; CLARA finds the best k medoids among the selected samples
Problems
The best k medoids may not be selected during the sampling process; in this case, CLARA will never find the best clustering
If the sampling is biased, we cannot obtain a good clustering
This is the trade-off made for efficiency
3.2.5 CLARANS
CLARANS (Clustering Large Applications based upon RANdomized Search) was proposed to improve the quality and the scalability of CLARA
It combines sampling techniques with PAM
It does not confine itself to any sample at a given time
It draws a sample with some randomness in each step of the
search
CLARANS: The idea
[Figure (clustering view): each set of k medoids has an associated cost; the search moves from the current medoids to a lower-cost neighboring configuration and keeps the current medoids when no cheaper neighbor exists]
CLARANS: The idea
CLARA
Draws a sample of nodes at the beginning of the search
Neighbors are from the chosen sample
Restricts the search to a specific area of the original data
[Figure: in both the first and the second step of the search, the neighbors of the current medoids are taken only from the chosen sample]
CLARANS: The idea
CLARANS
Does not confine the search to a localized area
Stops the search when a local minimum is found
Finds several local optima and outputs the clustering with the best local optimum
[Figure: in each step of the search, a random sample of the current medoids' neighbors is drawn from the original data]
The number of neighbors sampled from the original data is specified by the user (see the sketch below)
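The randomized search can be sketched as below; the parameters numlocal and maxneighbor, and the default max(250, 1.25% of k(n - k)), follow the original CLARANS proposal, while the remaining details (Manhattan cost, restart handling) are illustrative.

```python
import numpy as np

def clarans(D, k, numlocal=2, maxneighbor=None, seed=0):
    """Rough CLARANS sketch: randomized search over sets of k medoids."""
    rng = np.random.default_rng(seed)
    n = len(D)
    maxneighbor = maxneighbor or max(250, int(0.0125 * k * (n - k)))
    dist = np.abs(D[:, None, :] - D[None, :, :]).sum(axis=2)   # pairwise L1 distances

    def cost(meds):
        return dist[:, meds].min(axis=1).sum()

    best_cost, best_medoids = np.inf, None
    for _ in range(numlocal):                    # restart the search several times
        current = list(rng.choice(n, size=k, replace=False))
        j = 0
        while j < maxneighbor:                   # examine a random sample of neighbors
            neighbor = current.copy()
            pos = rng.integers(k)
            candidates = [o for o in range(n) if o not in current]
            neighbor[pos] = int(rng.choice(candidates))
            if cost(neighbor) < cost(current):   # move to the cheaper neighbor
                current, j = neighbor, 0
            else:
                j += 1
        if cost(current) < best_cost:            # keep the best local optimum found
            best_cost, best_medoids = cost(current), current
    labels = dist[:, best_medoids].argmin(axis=1)
    return best_medoids, labels, best_cost
```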
CLARANS Properties
Advantages
Experiments show that CLARANS is more effective than both PAM
and CLARA
Handles outliers
Disadvantages
The computational complexity of CLARANS is O(n^2), where n is the
number of objects
The clustering quality depends on the sampling method
Summary of Section 3.2
Partitioning methods find sphere-shaped clusters
K-means is efficient for large data sets but sensitive to outliers
PAM uses medoids (actual representative objects) instead of means
CLARA and CLARANS are used for clustering large databases