UNIT –III
ASSOCIATION AND RECOMMENDATION SYSTEM
Association Rule
Association rule mining finds interesting associations and relationships
among large sets of data items. This rule shows how frequently an itemset
occurs in a transaction. A typical example is Market Basket Analysis.
Market Basket Analysis is one of the key techniques used by large retailers
to show associations between items. It allows retailers to identify
relationships between the items that people buy together frequently.
Given a set of transactions, we can find rules that will predict the occurrence
of an item based on the occurrences of other items in the transaction.
TID Items
1 Bread, Milk
2 Bread, Diaper, Beer, Eggs
3 Milk, Diaper, Beer, Coke
4 Bread, Milk, Diaper, Beer
5 Bread, Milk, Diaper, Coke
Before we start defining the rule, let us first see the basic definitions.
Support Count (σ) - Frequency of occurrence of an itemset.
Here σ({Milk, Bread, Diaper}) = 2
Frequent Itemset - An itemset whose support is greater than or equal to
minsup threshold. Association Rule - An implication expression of the form
X -> Y, where X and Y are any two itemsets.
Example: {Milk, Diaper}->{Beer}
Rule Evaluation Metrics -
• Support(s) - The number of transactions that include items in both the
{X} and {Y} parts of the rule, as a percentage of the total number of
transactions. It is a measure of how frequently the collection of items
occurs together as a percentage of all transactions.
• Support = σ(X ∪ Y) ÷ total - It is interpreted as the fraction of
transactions that contain both X and Y.
• Confidence(c) - It is the ratio of the number of transactions that include
all items in both {X} and {Y} to the number of transactions that include all
items in {X}.
• Conf(X=>Y) = Supp(X ∪ Y) ÷ Supp(X) - It measures how often items in Y
appear in transactions that also contain items in X.
• Lift(l) - The lift of the rule X=>Y is the confidence of the rule
divided by the expected confidence, assuming that the itemsets X and Y
are independent of each other. The expected confidence is simply the
frequency (support) of {Y}.
• Lift(X=>Y) = Conf(X=>Y) ÷ Supp(Y) - A lift value near 1 indicates that
X and Y appear together about as often as expected, greater than 1 means
they appear together more often than expected, and less than 1 means they
appear together less often than expected. Greater lift values indicate a
stronger association.
Example - From the above table, {Milk, Diaper}=>{Beer}
s = σ({Milk, Diaper, Beer}) ÷ |T|
  = 2/5
  = 0.4
c = σ({Milk, Diaper, Beer}) ÷ σ({Milk, Diaper})
  = 2/3
  = 0.67
l = Supp({Milk, Diaper, Beer}) ÷ (Supp({Milk, Diaper}) × Supp({Beer}))
  = 0.4/(0.6 × 0.6)
  = 1.11
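These values can be verified with a few lines of Python. The sketch below is a minimal illustration for the five transactions listed above; the support() helper is introduced purely for this example and is not part of any library.

# Minimal sketch: support, confidence and lift for the five-transaction example.
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]

def support(itemset):
    # Fraction of transactions containing every item in `itemset`.
    return sum(itemset <= t for t in transactions) / len(transactions)

X, Y = {"Milk", "Diaper"}, {"Beer"}
s = support(X | Y)                # 2/5 = 0.4
c = support(X | Y) / support(X)   # 0.4/0.6 ≈ 0.67
l = c / support(Y)                # 0.67/0.6 ≈ 1.11
print(round(s, 2), round(c, 2), round(l, 2))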
Association rules are very useful in analyzing datasets. In supermarkets, the
data is collected using bar-code scanners. Such databases consist of a large
number of transaction records, each listing all items bought by a customer
in a single purchase. The manager can then see whether certain groups of
items are consistently purchased together and use this data to adjust store
layouts, cross-selling, and promotions.
Apriori Algorithm
Apriori Algorithm is a basic method used in data analysis to find groups of
items that often appear together in large sets of data. It helps to discover
useful patterns or rules about how items are related which is particularly
valuable in market basket analysis.
Like in a grocery store if many customers buy bread and butter together, the
store can use this information to place these items closer or create special
offers. This helps the store sell more and make customers happy.
How the Apriori Algorithm Works?
The Apriori Algorithm operates through a systematic process that involves
several key steps:
1. Identifying Frequent Itemsets
• The Apriori algorithm starts by looking through all the data to count
how many times each single item appears. These single items are called
1-itemsets.
• Next, it uses a threshold called minimum support: a number that tells
us how often an item or group of items needs to appear to be considered
important. If an item appears often enough, meaning its count meets this
minimum support, it is called a frequent itemset.
2. Creating Possible Item Groups
• After finding the single items that appear often enough (frequent 1-
item groups) the algorithm combines them to create pairs of items (2-
item groups). Then it checks which pairs are frequent by seeing if they
appear enough times in the data.
• This process keeps going step by step making groups of 3 items, then
4 items and so on. The algorithm stops when it can’t find any bigger
groups that happen often enough.
3. Removing Infrequent Item Groups
• The Apriori algorithm uses a helpful rule to save time. This rule says:
if a group of items does not appear often enough, then any larger group
that includes these items will also not appear often.
• Because of this, the algorithm does not check those larger groups.
This way it avoids wasting time looking at groups that won't be
important, making the whole process faster.
4. Generating Association Rules
• The algorithm makes rules to show how items are related.
• It checks these rules using support, confidence and lift to find the
strongest ones.
Key Metrics of Apriori Algorithm
• Support: This metric measures how frequently an item appears in the
dataset relative to the total number of transactions. A higher support
indicates a more significant presence of the itemset in the dataset.
Support tells us how often a particular item or combination of items
appears in all the transactions ("Bread is bought in 20% of all
transactions.")
• Confidence: Confidence assesses the likelihood that an item Y is
purchased when item X is purchased. It provides insight into the strength
of the association between two items. Confidence tells us how often
items go together. ("If bread is bought, butter is bought 75% of the
time.")
• Lift: Lift evaluates how much more likely two items are to be
purchased together compared to being purchased independently. A lift
greater than 1 suggests a strong positive association. Lift shows how
strong the connection is between items. ("Bread and butter are much
more likely to be bought together than by chance.")
• Leverage: Leverage measures the difference between the observed co-occurrence
frequency of items (antecedent and consequent) and their expected co-occurrence
frequency if they were statistically independent.
• Leverage(A→B) = Support(A ∪ B) − (Support(A) × Support(B))
Let's understand the concept of the Apriori Algorithm with the help of an
example. Consider the following dataset; we will find frequent itemsets
and generate association rules for them:
Transactions of a Grocery Shop
Step 1 : Setting the parameters
• Minimum Support Threshold: 50% (item must appear in at least
3/5 transactions). This threshold is formulated from this formula:
Support(A) = (Number of transactions containing itemset A) ÷ (Total number of transactions)
• Minimum Confidence Threshold: 70% ( You can change the
value of parameters as per the use case and problem statement
). This threshold is formulated from this formula:
Confidence(X→Y) = Support(X ∪ Y) ÷ Support(X)
Step 2: Find Frequent 1-Itemsets
Let's count how many transactions include each item in the dataset
(calculating the frequency of each item).
Frequent 1-Itemsets
All items have support ≥ 50%, so they qualify as frequent 1-itemsets.
If any item had support < 50%, it would be omitted from the frequent
1-itemsets.
Step 3: Generate Candidate 2-Itemsets
Combine the frequent 1-itemsets into pairs and calculate their
support. For this use case we get 3 item pairs: (Bread, Butter),
(Bread, Milk) and (Butter, Milk), and we calculate their support as in
Step 2.
Candidate 2-Itemsets
Frequent 2-itemsets: {Bread, Milk} meets the 50% threshold, but
{Butter, Milk} and {Bread, Butter} do not meet the threshold, so they
are omitted.
Step 4: Generate Candidate 3-Itemsets
Combine the frequent 2-itemsets into groups of 3 and calculate their
support. For the triplet we get only one case, i.e.
{Bread, Butter, Milk}, and we calculate its support.
Candidate 3-Itemsets
Since this does not meet the 50% threshold, there are no frequent
3-itemsets.
Step 5: Generate Association Rules
Now we generate rules from the frequent itemsets and calculate
confidence.
Rule 1: If Bread → Butter (if customer buys bread, the
customer will buy butter also)
• Support of {Bread, Butter} = 2.
• Support of {Bread} = 4.
• Confidence = 2/4 = 50% (Failed threshold).
Rule 2: If Butter → Bread (if customer buys butter, the
customer will buy bread also)
• Support of {Bread, Butter} = 2.
• Support of {Butter} = 3.
• Confidence = 2/3 ≈ 67% (Fails threshold).
Rule 3: If Bread → Milk (if customer buys bread, the
customer will buy milk also)
• Support of {Bread, Milk} = 3.
• Support of {Bread} = 4.
• Confidence = 3/4 = 75% (Passes threshold).
The Apriori Algorithm, as demonstrated in the bread-butter example,
is widely used in modern startups like Zomato, Swiggy and other
food delivery platforms. These companies use it to perform market
basket analysis which helps them identify customer behaviour
patterns and optimise recommendations.
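The same workflow can be reproduced programmatically. Below is a minimal sketch using pandas and the open-source mlxtend library (an assumption: both must be installed, e.g. via pip); the transaction list is an illustrative stand-in for the grocery data described above, not the original table.

import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori, association_rules

# Illustrative stand-in transactions (assumed data).
transactions = [
    ["Bread", "Milk"],
    ["Bread", "Butter"],
    ["Bread", "Milk", "Butter"],
    ["Milk", "Butter"],
    ["Bread", "Milk"],
]

# One-hot encode the transactions into a boolean DataFrame.
te = TransactionEncoder()
onehot = pd.DataFrame(te.fit(transactions).transform(transactions), columns=te.columns_)

# Steps 1-4: frequent itemsets with minimum support 50%.
frequent = apriori(onehot, min_support=0.5, use_colnames=True)

# Step 5: association rules filtered by minimum confidence 70%.
rules = association_rules(frequent, metric="confidence", min_threshold=0.7)
print(rules[["antecedents", "consequents", "support", "confidence", "lift", "leverage"]])

The resulting rules table reports support, confidence, lift and leverage for each rule, matching the metrics defined earlier.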
Applications of Apriori Algorithm
Below are some applications of Apriori algorithm used in today's
companies and startups
1. E-commerce: Used to recommend products that are often
bought together like laptop + laptop bag, increasing sales.
2. Food Delivery Services: Identifies popular combos such as
burger + fries, to offer combo deals to customers.
3. Streaming Services: Recommends related movies or shows
based on what users often watch together like action + superhero
movies.
4. Financial Services: Analyzes spending habits to suggest
personalised offers such as credit card deals based on frequent
purchases.
5. Travel & Hospitality: Creates travel packages like flight +
hotel by finding commonly purchased services together.
6. Health & Fitness: Suggests workout plans or supplements
based on users' past activities like protein shakes + workouts.
APPLICATIONS OF ASSOCIATION RULES:
Association rules in big data analytics are utilized to discover
interesting relationships and patterns within massive datasets. Their
applications span various industries and functions:
1. Market Basket Analysis:
• Identifying products frequently purchased together in retail
settings (e.g., "customers who buy bread also buy milk").
• Optimizing store layouts, product placements, and promotional
strategies.
• Informing cross-selling and up-selling initiatives.
2. Recommendation Systems:
• Generating personalized product recommendations for e-
commerce platforms based on user purchase history and browsing
behavior.
• Suggesting content (movies, music, articles) to users based on
their past consumption and preferences of similar users.
3. Customer Analytics and Segmentation:
• Understanding customer buying patterns and preferences to
segment customers into distinct groups.
• Developing targeted marketing campaigns and personalized
offers based on identified associations.
4. Fraud Detection:
• Identifying unusual patterns or combinations of transactions
that may indicate fraudulent activity in financial services or insurance.
• Detecting anomalies in network security logs to identify
potential cyber threats.
5. Healthcare:
• Analyzing patient data to identify co-occurring medical
conditions, potential risk factors for diseases, and effective treatment
pathways.
• Discovering associations between symptoms and diagnoses to
aid in medical research and clinical decision-making.
6. Web Usage Mining:
• Analyzing user navigation patterns on websites to understand
user behavior and optimize website design and content.
• Identifying frequently visited pages or common clickstream
sequences.
7. Text Mining:
• Discovering relationships between words, phrases, or topics in
large text corpora (e.g., academic papers, news articles).
• Identifying co-occurring keywords or themes in documents for
information retrieval and summarization.
8. Inventory Management:
• Optimizing inventory levels by identifying products with strong
co-occurrence patterns, ensuring that complementary items are
adequately stocked.
9. Quality Control and Manufacturing:
• Identifying associations between manufacturing parameters
and product defects to improve production processes and product
quality.
What are Recommender Systems?
There are so many choices that people often feel overwhelmed, whether they're
trying to choose a movie to watch, the right product to buy, or new music to
listen to. To solve this problem, recommendation systems come into play:
they help people find their way through all of these choices by giving them
suggestions based on their likes and dislikes.
In this tutorial, we will understand the concept of Recommendation
Systems, their methodologies, and their importance.
Understanding Recommendation Systems
Recommendation systems, often known as recommender systems, are a
type of information filtering system that attempts to forecast the "rating" or
"preference" that a user would assign to an item. They are common in
today's digital scene, serving an important role in online shopping,
streaming services, social networking, and other platforms where
personalization and user experience are critical.
Algorithms in recommendation systems evaluate user data, such as prior
purchases, reviews, or browsing history, to find trends and preferences,
and then utilize this information to recommend goods that are likely to
interest the user.
Examples Of Recommendation Systems:
• Online e-commerce platforms such as Amazon recommend goods based
on your browsing and purchase history.
• Music streaming services like Spotify propose songs and artists
based on your listening history.
• Video streaming providers such as Netflix recommend movies and
TV series based on your watching history.
Types of Recommendation Systems
There are mainly three methodologies for Recommendation Systems:
collaborative filtering, content-based filtering, and hybrid systems.
Method 1. Collaborative filtering
Collaborative filtering is a method used in recommendation systems to
forecast items a user may enjoy based on the preferences of other users
with similar tastes. It works by analyzing user interactions and identifying
similarities between people (user-based) and items (item-based). For
example, if User A and User B like the same movies, User A may also enjoy
other movies that User B enjoys.
1.1 User-based Collaborative Filtering
This technique predicts products that a user could appreciate based on
ratings provided to that item by other users who share the target user's
preferences. The steps are as follows:
• Finding similarities between users and the target user: This is
determined using a similarity measure computed over the ratings both
users have provided for common items.
• Predict the missing rating of an item: The ratings that come from
users who are more like you are given more weight than the ratings that
come from users who are less like you. This is accomplished using
a weighted average method.
1.2 Item-Based Collaborative Filtering
This method predicts which items a user would enjoy based on the
similarity between items. The steps are as follows:
• Item to item similarity: The similarity of all item pairings is
determined, often using cosine similarity.
• Prediction Computation: A rating is generated using the items that
the user has previously rated and are most comparable to the missing
item. This is accomplished using a method that calculates the rating for a
specific item based on a weighted sum of the ratings of other comparable
goods.
Both user-based and item-based collaborative filtering may work on the
same data; the choice between them is determined by the recommendation
system's specific needs.
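The following is a minimal sketch of the user-based variant, assuming a small toy ratings matrix in which 0 marks an unrated item; it applies the cosine-similarity and weighted-average steps described above, and the item-based variant would apply the same logic to the columns of the matrix instead of the rows.

import numpy as np

# Toy user-item ratings (assumed data); rows = users, columns = items, 0 = unrated.
R = np.array([
    [5, 4, 0, 1],
    [4, 5, 4, 0],
    [1, 2, 5, 4],
    [0, 1, 4, 5],
], dtype=float)

def cosine(u, v):
    # Cosine similarity between two rating vectors.
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

def predict(user, item, k=2):
    # Weighted average of the k most similar users who have rated the item.
    sims = np.array([cosine(R[user], R[other]) if other != user else -1.0
                     for other in range(R.shape[0])])
    neighbours = [u for u in np.argsort(sims)[::-1] if R[u, item] > 0][:k]
    weights = sims[neighbours]
    return np.dot(weights, R[neighbours, item]) / weights.sum()

print(round(predict(user=0, item=2), 2))   # predicted rating of user 0 for item 2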
Method 2. Content-based filtering
Content-based filtering is a technique used in recommender systems to
suggest items that are comparable with an item a user has shown interest in,
based on the item's attributes. It uses machine learning algorithms to classify
similar items based on inherent characteristics such as genres, directors, or
keywords associated with previously seen movies. This strategy is
especially effective for enterprises that provide a variety of goods, services,
or information since it may make individualized suggestions to consumers
based on their previous behavior or explicit input. If a user has given high
ratings to action movies, the algorithm will propose more action movies
based on genres, directors, or keywords connected with previously loved
movies.
• Content-based filtering involves representing items and users in a
feature space, which may contain categories, publishers, and other
relevant properties.
• The similarity between a user and an item is then determined using
the dot product, which reflects how many features are active in both
vectors. A high dot product suggests more common features, resulting in
a higher similarity (see the sketch below).
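Here is a minimal sketch of that dot-product scoring; the binary feature vectors and feature names are invented for illustration.

import numpy as np

# Assumed feature order: [action, comedy, sci-fi, directed-by-Nolan]
user_profile = np.array([1.0, 0.0, 0.8, 0.9])     # built from items the user liked

items = {
    "Inception":    np.array([1, 0, 1, 1]),
    "The Notebook": np.array([0, 1, 0, 0]),
    "Interstellar": np.array([0, 0, 1, 1]),
}

# Higher dot product -> more shared active features -> stronger recommendation.
scores = {title: float(np.dot(user_profile, vec)) for title, vec in items.items()}
for title, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(title, score)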
Content-based filtering is implemented using classification models and
the vector space method. The classification strategy makes use of
machine learning models such as decision trees, while the vector space
method makes recommendations based on the distance between the user and
item vectors.
One of the primary benefits of content-based filtering is that it does not
rely on data from other users to create suggestions, making it especially
effective for people with specific taste or items with low user interaction
data. However, it may be limited by the quality of the item features and the
algorithm's ability to capture the intricacies of human preferences.
Method 3. Hybrid systems
Hybrid systems in recommendation systems combine collaborative and
content-based methods to leverage the strengths of each approach,
resulting in more accurate and diversified recommendations. These systems
often start with content-based filtering to study new users and gradually
integrate collaborative filtering as more interaction data becomes available.
Hybrid recommender systems can be categorized into weighted, feature
combination, cascade, feature augmentation, meta-level, switching, and
mixed models. The feature combination method interprets collaborative
information as additional feature associated with each example and applies
content-based approaches to this enriched data collection. The meta-level
hybrid recommender system combines two recommender systems such that
the output of one becomes the input for the other.
Hybrid recommender systems are the most effective approach to developing
a recommender system. However, they do have drawbacks, such as the
ramp-up problem, since both systems need a database of ratings.
Knowledge-based and utility-based recommender strategies do not require
such a ratings database and therefore avoid this problem. The most
popular hybrid recommender systems are feature augmentation and meta-
level systems, which feed information from one recommender into the other.
How Do Recommendation Systems Work?
Recommender systems operate by filtering and predicting user preferences
using sophisticated algorithms and extensive data analysis. The basic
mechanics of recommender systems include several critical elements:
• User profiles are built using both explicit data, such as ratings and
reviews, and implicit data, including browsing history and click habits.
• Item profiles provide information about the objects, such as genre,
actors, and movie keywords.
The recommendation algorithms then examine these profiles using
methods such as matrix factorization, which breaks down user-item
interactions into latent elements, or deep learning models, which detect
complicated patterns in big datasets. These algorithms estimate what things
a user would favor and rank them appropriately.
Deep Neural Network Models for Recommendation Systems
Deep learning has transformed the models of recommender systems by
developing sophisticated models capable of capturing complex patterns in
user behavior and item features. Some of the most common deep learning
models used for recommendation are:
• Autoencoders: Autoencoders are neural networks that learn to
represent input efficiently. In recommender systems, autoencoders are
used to rebuild user-item interaction matrices. The objective is to
compress user preferences into a smaller latent space, and then recover
the original user preferences from this compressed representation. The
network's encoder reduces the data's dimensionality, while the decoder
reconstructs it.
• Deep Neural Networks (DNNs): Multiple layers of interconnected
neurons are used in DNNs. The input data is transformed into a higher-
level representation by each layer, enabling the network to capture
complex patterns. The intricate relationships between users and items are
modeled by DNNs by considering various features such as user
demographics, item attributes, and historical interactions. The likelihood
of a user interacting with an item is predicted by using these models.
• Convolutional Neural Network (CNNs): Image and video
processing are primarily performed using CNNs. Images, videos, or any
content where spatial or temporal patterns are important can be
recommended by applying CNNs. High-level features from visual
content are extracted by CNNs to help recommend similar items based
on visual similarity.
• Recurrent Neural Networks (RNNs): RNNs work with sequential
data, where the output at each step depends on the previous inputs.
Session-based recommendations, where the order of interactions matters,
are ideal for RNNs. Temporal dependencies in user behavior are
modeled to provide recommendations based on the sequence of actions
taken by a user.
• Attention Mechanisms: Models are allowed to focus on the most
relevant parts of the input data by attention mechanisms. Different parts
of the input are dynamically weighed, highlighting the most important
features. In recommendation systems, features or interactions that most
influence a user’s preferences are identified and prioritized by attention
mechanisms. More accurate predictions are made by the model by
concentrating on the crucial parts of the input.
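As an illustration of the autoencoder idea mentioned above, here is a minimal sketch using Keras (an assumption: TensorFlow must be installed); a toy user-item matrix is compressed into a two-dimensional latent space and reconstructed, and the reconstructed values at previously empty cells can be read as predicted preferences.

import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Toy user-item interaction matrix (assumed data); 0 = no interaction.
R = np.array([
    [5, 4, 0, 1],
    [4, 5, 4, 0],
    [1, 0, 5, 4],
    [0, 1, 4, 5],
], dtype="float32") / 5.0                     # scale ratings to [0, 1]

n_items = R.shape[1]
autoencoder = keras.Sequential([
    layers.Input(shape=(n_items,)),
    layers.Dense(2, activation="relu"),           # encoder: latent representation
    layers.Dense(n_items, activation="sigmoid"),  # decoder: reconstruction
])
autoencoder.compile(optimizer="adam", loss="mse")
autoencoder.fit(R, R, epochs=200, verbose=0)      # learn to reconstruct the input

# Reconstructed matrix: values at zero cells act as predicted preferences.
print(np.round(autoencoder.predict(R, verbose=0) * 5, 1))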
Importance of Recommendation Systems
Recommender systems are an essential component of current digital
platforms, helping to improve user experiences, drive engagement, and
provide decision-making tools. These systems serve as information filtering
tools, providing users with tailored material or information that is relevant to
their taste and interests.
Recommender systems have become essential for organizations since they
can significantly boost income by making tailored suggestions that result in
improved sales.
• Faster Decision-making: Recommender systems increase user
tendency to purchase suggested things, boost loyalty and overall
happiness, lower transaction costs, and improve decision-making process
and quality.
• Personalized user experience: Making highly relevant and valuable
suggestions, recommender systems improve the user experience.
• Increase engagement: Recommendation systems help users interact
with a system by providing them material, goods, or services that they
are likely to be interested in.
Conclusion
To summarize, recommendation systems are an essential component of
current digital platforms, playing an important role in improving user
experiences, increasing engagement, and offering decision-making tools.
These systems utilize complex algorithms and extensive data analysis to
deliver personalized recommendations that adapt to specific user tastes,
increasing the chance of purchase, increasing loyalty and overall pleasure,
and enhancing decision-making.
What is collaborative filtering?
Collaborative filtering is a type of recommender system. It groups users based on
similar behavior, recommending new items according to group characteristics.
Collaborative filtering is an information retrieval method that recommends items to
users based on how other users with similar preferences and behavior have interacted
with that item. In other words, collaborative filtering algorithms group users based on
behavior and use general group characteristics to recommend items to a target user.
Collaborative recommender systems operate on the principle that similar users
(behavior-wise) share similar interests and similar tastes.1
Collaborative filtering vs content-based filtering
Collaborative filtering is one of two primary types of recommender systems, the other
being content-based recommenders. This latter method uses item features to
recommend items similar to those with which a particular user has positively
interacted in the past.2 While collaborative filtering focuses on user similarity to
recommend items, content-based filtering recommends items exclusively according to
item profile features. Content-based filtering thus targets recommendations to one
specific user's own preferences, whereas collaborative filtering draws on the behavior
of a larger group of users.
How collaborative filtering works
Collaborative filtering uses a matrix to map user behavior for each item in its system.
The system then draws values from this matrix to plot as data points in a vector space.
Various metrics then measure the distance between points as a means of calculating
user-user and item-item similarity.
User-item matrix
In a standard setting of collaborative filtering, we have a set of n users and a set
of x items. Each user’s individual preference for each item is displayed in a user-item
matrix (sometimes called a user rating matrix). Here, users are represented in rows
and items in columns. In the Rij matrix, a given value represents the behavior of
user u toward item i. These values may be continuous numbers provided by users (for
example ratings) or binary values that signify whether a given user viewed or
purchased the item. Here is an example user-item matrix for a bookshop website:
This matrix displays user ratings for the different books available. A collaborative
filtering algorithm compares users' provided ratings for each book. By identifying
similar users or items based on those ratings, it predicts ratings for books a target user
has not seen—represented by null in the matrix—and recommends (or does not
recommend) those books to the target user accordingly.
The example matrix used here is full because it is restricted to four users and four items.
However, in real world scenarios known users’ preferences for items are often
limited, leaving the user-item matrix sparse.3
Similarity measures
How does a collaborative recommendation algorithm determine similarity between
various users? As mentioned, proximity in vector space is a primary method. But the
specific metrics used to determine that proximity may vary. Two such metrics are
cosine similarity and Pearson correlation coefficient.
Cosine similarity
Cosine similarity signifies the measurement of the angle between two vectors.
Compared vectors comprise a subset of ratings for a given user or item. The cosine
similarity score can be any value between -1 and 1. The higher the cosine score, the
more alike two items are considered. Some sources recommend this metric for high-
dimensional feature spaces. In collaborative filtering, vector points are pulled directly
from the user-item matrix. Cosine similarity is represented by this formula,
where x and y signify two vectors in vector space:
cos(x, y) = (x · y) ÷ (‖x‖ × ‖y‖)
Pearson correlation coefficient (PCC)
PCC helps measure similarity between items or users by computing the correlation
between two users’ or items’ respective ratings. PCC ranges between -1 and 1, which
signify negative to identical correlation. Unlike cosine similarity, PCC uses all the
ratings for a given user or item. For example, if calculating PCC between two users,
we use this formula, in which a and b are different users, and r_ai and r_bi are each
user's rating for item i:
PCC(a, b) = Σᵢ (r_ai − r̄_a)(r_bi − r̄_b) ÷ ( √(Σᵢ (r_ai − r̄_a)²) × √(Σᵢ (r_bi − r̄_b)²) )
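Both metrics can be computed in a few lines of Python; the two rating vectors below are toy, assumed data.

import numpy as np

# Toy rating vectors for two users over the same five books (assumed data).
user_a = np.array([5.0, 3.0, 4.0, 4.0, 2.0])
user_b = np.array([4.0, 2.0, 5.0, 3.0, 1.0])

# Cosine similarity: cosine of the angle between the two rating vectors.
cos_sim = np.dot(user_a, user_b) / (np.linalg.norm(user_a) * np.linalg.norm(user_b))

# Pearson correlation coefficient of the two users' ratings.
pcc = np.corrcoef(user_a, user_b)[0, 1]

print(round(cos_sim, 3), round(pcc, 3))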
Types of collaborative recommender systems
There are two primary types of collaborative filtering systems:
1. Memory-based
2. Model-based
Memory-based
Memory-based recommender systems, or neighbor-based systems, are extensions
of k-nearest neighbors classifiers because they attempt to predict a target user’s
behavior toward a given item based on similar users or set of items. Memory-based
systems can be divided into two sub-types:
• User-based filtering recommends items to a target user based on the
preferences of similarly behaving users. The recommendation algorithm compares a
target user's past behavior to that of other users. Specifically, the system assigns each
user a weight representing their perceived similarity with the target user—these
weighted users are the target user's neighbors. It then selects the n users with the highest weights
and computes a prediction of the target user’s behavior (e.g. movie rating,
purchase, dislikes, etc.) from a weighted average of the selected neighbors’
behavior. The system then recommends items to the target user based on this
prediction. The principle is that, if the target user behaved similarly to this
group in the past, they will behave similarly with unseen items. User-based
similarity functions are computed between rows in the user-item matrix.6
• Item-based filtering recommends new items to a target user based on that
user’s behavior toward similar items. Note, however, that in comparing items,
the collaborative system does not compare item features (as in content-based
filtering) but instead how users interact with those items. For instance, in a
movie recommendation system, the algorithm may identify similar movies
based on correlations between all user ratings for each movie (correcting for
each user’s average rating). The system will then recommend a new movie to
a target user based on correlated ratings. That is, if the target user rated
movies a and b highly but has not seen movie c, and other users who rated the
former two highly also rated movie c highly, the system will recommend
movie c to the target user. In this way, item-based filtering calculates item
similarity through user behavior. Item-based similarity functions are computed
between columns in the user-item matrix.7
Model-based
At times, literature describes memory-based methods as instance-based learning
methods. This points to how user and item-based filtering make predictions specific to
a given instance of user-item interaction, such as a target user’s rating for an unseen
movie.
By contrast, model-based methods create a predictive machine learning model of the
data. The model uses present values in the user-item matrix as the training dataset and
produces predictions for missing values with the resultant model. Model-based
methods thus use data science techniques and machine learning algorithms such
as decision trees, Bayes classifiers, and neural networks to recommend items to
users.8
Matrix factorization is a widely discussed collaborative filtering method often
classified as a type of latent factor model. As a latent factor model, matrix
factorization assumes user-user or item-item similarity can be determined through a
select number of features. For instance, a user’s book rating may be predicted using
only book genre and user age or gender. This lower-dimensional representation
thereby aims to explain, for example, book ratings by characterizing items and users
according to a few select features pulled from user feedback data.9 Because it reduces
the features of a given vector space, matrix factorization also serves as
a dimensionality reduction method.10
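A minimal sketch of this idea, using a truncated singular value decomposition on a toy user-item matrix (assumed data), is shown below; the low-rank reconstruction supplies estimates for the missing (zero) entries.

import numpy as np

# Toy user-item rating matrix (assumed data); 0 = unknown rating.
R = np.array([
    [5, 4, 0, 1],
    [4, 5, 4, 0],
    [1, 0, 5, 4],
    [0, 1, 4, 5],
], dtype=float)

# Keep k latent factors, i.e. a low-rank approximation of R.
k = 2
U, s, Vt = np.linalg.svd(R, full_matrices=False)
R_hat = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Reconstructed values at the zero entries serve as rating estimates.
print(np.round(R_hat, 1))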
Advantages and disadvantages of collaborative filtering
Advantages
Compared to content-based systems, collaborative filtering is more effective at
providing users with novel recommendations. Collaborative-based methods draw
recommendations from a pool of users who share interests with one target user. For
instance, if a user group liked the same set of items as the target user, but also liked an
additional item unknown to the target user because it shares no features with the
previous set of items, a collaborative filtering system recommends this novel item to
the user. Collaborative filtering can recommend items that a target user may have not
considered but that nevertheless appeal to their user type.11
Disadvantages
The cold start problem is perhaps the most widely cited disadvantage of collaborative
filtering systems. It occurs when a new user (or even a new item) enters the system.
That user’s lack of item-interaction history prevents the system from being able to
evaluate the new user’s similarity or association with existing users. By contrast,
content-based systems are more adept at handling new items, although they also
struggle with recommendations for new users.12
DATA SPARSITY
Data sparsity is another chief problem that can plague collaborative recommendation
systems. As mentioned, recommender systems typically lack data on user preferences
for most items in the system. This means that most of the system’s feature space is
empty, a condition called data sparsity. As data sparsity increases, vector points
become so dissimilar that predictive models become less effective at identifying
explanatory patterns.13 This is a primary reason why matrix factorization—and related
latent factor methods such as singular value decomposition—is popular in
collaborative filtering, as it alleviates data sparsity by reducing features. Other
methods implemented for resolving this issue may also involve users themselves
assessing and providing information on their own interests, which the system can then
use to filter recommendations.
Recent research
While past studies have approached recommendation as a prediction or classification
problem, a substantive body of recent research argues that it is better understood as a
sequential decision-making problem. In this paradigm, reinforcement learning might
be more suitable for addressing recommendation. In this approach, the
recommendation updates in real time according to user-item interaction; as the user
skips, clicks, rates, or purchases suggested items, the model develops an optimal policy
from this feedback to recommend new items.14 Recent studies propose a wide variety
of reinforcement learning applications to address mutable, long-term user interests,
which pose challenges for both content-based and collaborative filtering.15
Data sparsity refers to having insufficient user-item interaction data, while the cold
start problem occurs when there's a lack of information on new users or items, making
it hard for recommender systems to provide accurate recommendations. These are
related challenges, with sparsity being the general lack of data and cold start being a
specific instance of that lack of data when new entities are introduced into the
system.
Data Sparsity
• Definition:
A situation where the available data in a recommender system is sparse, meaning it's
irregular, insufficient, or highly varied. This is common because users typically only
rate a small percentage of the available items.
• Impact:
Leads to difficulty in identifying reliable similar users or items, as there isn't enough
interaction data to form accurate patterns.
• Solutions:
• Matrix Factorization: Techniques like Singular Value
Decomposition (SVD) can reduce data dimensionality and overcome sparsity by
creating latent feature vectors for users and items.
• Hybrid Models: Combining different recommendation techniques or
incorporating side information like contextual data to improve the understanding of
user preferences and reduce reliance on sparse ratings.
Cold Start Problem
• Definition:
A challenge where a system struggles to generate recommendations due to
insufficient information for new users or new items.
• Types:
• User Cold Start: A new user joins the system, and there is no
historical data or ratings to understand their preferences.
• Item Cold Start: A new item is added to the system, and it has no
ratings or views, making it difficult to recommend to anyone.
• Impact:
The system cannot connect new users or items to existing patterns, resulting in poor
or no recommendations for them.
• Solutions:
• Side Information: Utilizing information beyond user-item
interactions, such as user demographics, item attributes, or social connections, to
make initial inferences.
• Transfer Learning: Transferring knowledge from existing, well-
understood users or items to new ones to help with predictions.
• Content-Based Filtering: Relying on item features or content to
provide recommendations for new items or users with similar feature preferences,
even without interaction data.
Content Based Recommender System
Content-Based Recommender Systems focus on the characteristics
of items and the preferences of users to generate personalized
recommendations.
They use information about a user's past behavior and item features
to recommend similar items.
This can include explicit feedback such as ratings or even implicit
feedback like clicks, views or time spent on content. Based on this
data, the system generates a user profile, which is then used to
find items that closely match the user's preferences. As time
passes, the user continues to interact with the system which results
in more accurate and relevant suggestions.
Example: Netflix.
Components of a Content-Based Recommender
1. User Profile
The User Profile is a representation of the user’s preferences. We model it
as a feature vector, capturing characteristics of items the user liked or
interacted with.
• A utility matrix is used to represent interactions between users and
items (ratings, clicks, likes).
• The system analyzes items previously rated or liked by the user to
identify key features like genre, actors, etc.
• These features are then aggregated to form the user's profile vector.
Example: If a user likes action movies directed by Christopher Nolan and
starring Christian Bale, their profile may have high weights for the features
"action", "Christopher Nolan" and "Christian Bale".
2. Item Profile
Each item is also represented as a vector of relevant features. Key features
depend on the domain:
• Movies: genre, director, actors, release year, IMDb rating.
• Books: author, genre, publication year, keywords.
• Products: brand, category, specifications, price.
The item profile captures the essence of what the item is about. This
information is later compared with the user profile to measure similarity.
3. Utility Matrix
The Utility Matrix represents the preferences of users for different items.
Each row corresponds to a user and each column corresponds to an item.
The matrix can be partially filled, as users rarely rate or interact with all
available items.
User / Movie | Inception | The Dark Knight | Interstellar | The Notebook
User A       | 5         | 4               | 5            | 1
User B       | 4         | 5               | ?            | ?
Some of the cells in the matrix are blank because we don't get complete
input from every user; the goal of a recommendation system is not to fill
in every cell but to recommend a movie to the user which they will prefer.
Making Recommendations
Once the user and item profiles are created the system must determine how
well each item aligns with the user's preferences. Two common approaches
are:
Method 1: Cosine Similarity
Cosine similarity is used to measure the angle between the user vector and
the item vector. The smaller the angle (closer to 0), the higher the similarity.
How it works:
• A user vector might include positive weights for preferred features
(e.g., genres, actors) and negative weights for disliked features.
• The cosine of the angle between vectors indicates how closely an
item matches the user’s taste.
Formula:
Cosine Similarity(u, i) = (u · i) ÷ (‖u‖ × ‖i‖)
Where:
• u is the user profile vector
• i is the item profile vector
A higher similarity score indicates a better match and such items are
recommended to the user.
Method 2: Classification-Based Approach
Instead of calculating similarities, we can treat recommendation as a
classification problem, predicting whether a user will like or dislike
an item.
Example classifier: Decision Tree
• Features: Genre, director, actors, duration, etc.
• At each node, the tree asks a question like "Is the genre
Action?"
• The tree refines decisions at each level until it predicts a
binary outcome: like or dislike.
This approach can be extended using:
• Logistic Regression
• Random Forests
• Support Vector Machines (SVMs)
It works well when we have labeled data and want more interpretable rules
for recommendations.
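The sketch below illustrates this classification approach with scikit-learn's DecisionTreeClassifier (an assumption: scikit-learn must be installed); the movie features and like/dislike labels are invented for illustration.

from sklearn.tree import DecisionTreeClassifier

# Illustrative training data (assumed): one row per movie the user has rated.
# Feature order: [is_action, is_drama, directed_by_nolan, duration_minutes]
X_train = [
    [1, 0, 1, 148],
    [1, 0, 0, 136],
    [0, 1, 0, 123],
    [0, 1, 1, 169],
    [0, 1, 0, 110],
]
y_train = ["like", "like", "dislike", "like", "dislike"]

clf = DecisionTreeClassifier(max_depth=3, random_state=0)
clf.fit(X_train, y_train)

# Predict whether the user will like an unseen movie with these features.
print(clf.predict([[1, 0, 1, 152]]))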
Advantages of Content-Based Recommendation
• Personalized: Gives suggestions based on individual preferences.
• No cold start for items: Since recommendations depend on item
features, new items can be recommended if their features are known.
• User independence: Unlike collaborative filtering, content-based
methods don’t need data from other users.
Content-Based Recommender Systems offer a way to suggest items by
analyzing user behavior and item attributes. They come with limitations but
their effectiveness can be increased when combined with other techniques in
hybrid models.
Knowledge-based recommender systems
Knowledge based recommenders are a specific type of recommender
system that are based on explicit knowledge about the item assortment, user
preferences, and recommendation criteria (i.e., which item should be
recommended in which context). These systems are applied in scenarios
where alternative approaches such as collaborative filtering and content-
based filtering cannot be applied.
A major strength of knowledge-based recommender systems is the non-
existence of cold start (ramp-up) problems. A corresponding drawback is a
potential knowledge acquisition bottleneck triggered by the need to define
recommendation knowledge in an explicit fashion.
Item domains
Knowledge-based recommender systems are well suited to complex domains
where items are not purchased very often, such as apartments and cars.
Further examples of item domains relevant for knowledge-based
recommender systems are financial services, digital cameras, and tourist
destinations.
Rating-based systems often do not perform well in these domains due to the
low number of available ratings.
Additionally, in complex item domains, customers want to specify their
preferences explicitly (e.g., "the maximum price of the car is X") . In this
context, the recommender system must take into account constraints: for
instance, only those financial services that support the investment period
specified by the customer should be recommended. Neither of these aspects
is supported by approaches such as collaborative filtering and content-
based filtering.
Conversational recommendation
Knowledge-based recommender systems are often conversational, i.e., user
requirements and preferences are elicited within the scope of a feedback
loop. A major reason for the conversational nature of knowledge-based
recommender systems is the complexity of the item domain where it is often
impossible to articulate all user preferences at once. Furthermore, user
preferences are typically not known exactly at the beginning but are
constructed within the scope of a recommendation session.[6]
Search-based recommendation
In a search-based recommender, user feedback is given in terms of answers
to questions which restrict the set of relevant items.[7] An example of such a
question is "Which type of lens system do you prefer: fixed or exchangeable
lenses?". On the technical level, search-based recommendation scenarios
can be implemented on the basis of constraint-based recommender
systems.[7] Constraint-based recommender systems are implemented on the
basis of constraint search [7][8] or different types of conjunctive query-based
approaches.[9]
Navigation-based recommendation
In a navigation-based recommender, user feedback is typically provided in
terms of "critiques" [10] which specify change requests regarding the item
currently recommended to the user.
Critiques are then used for the recommendation of the next "candidate" item.
An example of a critique in the context of a digital camera recommendation
scenario is "I would like to have a camera like this but with a lower price". This
is an example of a "unit critique" which represents a change request on a
single item attribute.
"Compound critiques" [4] allow the specification of more than one change
request at a time.
"Dynamic critiquing" [11] also takes into account preceding user critiques (the
critiquing history). More recent approaches additionally exploit information
stored in user interaction logs to further reduce the interaction effort in terms
of the number of needed critiquing cycles.[12][13][14][15] [16]
Hybrid Recommendation Systems
Hybrid recommendation systems combine collaborative and
content-based filtering techniques to capitalize on the advantages of
both approaches. Hybrid systems aim to provide more accurate and
personalized recommendations by combining these methods,
overcoming the limitations of individual techniques.
Example: Netflix. The platform makes recommendations by comparing the
watching and searching habits of similar users (i.e., collaborative
filtering) as well as by offering movies that share characteristics with
films that a user has rated highly (content-based filtering).
HYBRIDIZATION DESIGNS
Advantage of Hybrid Recommendation System
Hybrid recommendation systems use both collaborative and
content-based filtering to improve accuracy and overcome the cold
start problem. These systems provide more accurate and diverse
recommendations by leveraging user interactions and item
attributes, catering to a variety of user preferences while also
addressing the challenges of insufficient data for new users or
items.
Challenges faced in Hybrid Recommendation System
Integrating collaborative and content-based filtering necessitates
careful planning to ensure coordination. The challenges include
managing data sparsity and scalability, as collaborative filtering is
based on sparse user interactions, and maintaining and tuning the
system, which is complex and resource-intensive, necessitating
continuous monitoring and adjustment.