Deep Meta-learning in Recommendation Systems: A Survey
CHUNYANG WANG, YANMIN ZHU, HAOBING LIU, TIANZI ZANG, JIADI YU, FEILONG
TANG, Shanghai Jiao Tong University, China
Deep neural network based recommendation systems have achieved great success as information filtering
techniques in recent years. However, since model training from scratch requires sufficient data, deep learning-
based recommendation methods still face the bottlenecks of insufficient data and computational inefficiency.
Meta-learning, as an emerging paradigm that learns to improve the learning efficiency and generalization
ability of algorithms, has shown its strength in tackling the data sparsity issue. Recently, a growing number of
studies on deep meta-learning based recommendation systems have emerged for improving the performance
under recommendation scenarios where available data is limited, e.g., user cold-start and item cold-start.
Therefore, this survey provides a timely and comprehensive overview of current deep meta-learning based
recommendation methods. Specifically, we propose a taxonomy to discuss existing methods according to
recommendation scenarios, meta-learning techniques, and meta-knowledge representations, which could
provide the design space for meta-learning based recommendation methods. For each recommendation
scenario, we further discuss technical details about how existing methods apply meta-learning to improve
the generalization ability of recommendation models. Finally, we also point out several limitations in current
research and highlight some promising directions for future research in this area.
CCS Concepts: • Information systems → Recommender systems.
Additional Key Words and Phrases: Recommendation Systems; Meta-learning; Learning-to-Learn; Survey;
Cold-start; Few-shot Learning
ACM Reference Format:
Chunyang Wang, Yanmin Zhu, Haobing Liu, Tianzi Zang, Jiadi Yu, Feilong Tang. 2018. Deep Meta-learning
in Recommendation Systems: A Survey. J. ACM 37, 4, Article 111 (August 2018), 40 pages. https://doi.org/10.1145/1122445.1122456
1 INTRODUCTION
In recent years, recommendation systems working as filtering systems for alleviating information
overload have been widely applied in various online applications including e-commerce, entertain-
ment services, news, and so on. By presenting personalized suggestions among a large number of
candidates, recommendation systems have achieved great success in improving user experience and
increasing the attractiveness of online platforms. With the development of data-driven machine
learning algorithms [3, 90], especially deep learning based methods [9, 32, 121], academic and
industrial research in this field has greatly improved the performance of recommendation systems
in terms of accuracy, diversity, interpretability, and so on.
Owing to their expressive representation learning abilities, which discover hidden dependencies from sufficient
data, deep learning based methods have been largely introduced in contemporary recommendation
models [26, 121]. By leveraging a great number of training instances with diverse data structures
(e.g., interaction pairs [121], sequences [20], and graphs [26]), recommendation models with deep
neural architectures are usually designed to effectively capture nonlinear and nontrivial user/item
relationships. However, conventional deep learning based recommendation models are usually
trained from scratch with sufficient data based on predefined learning algorithms. For instance,
the regular supervised learning paradigm typically trains a unified recommendation model with
interactions collected from all users and performs recommendation over unseen interactions based
on learned feature representations. Such deep learning based methods are usually data-hungry and
computation-hungry. In other words, the performance of deep learning based recommendation
systems heavily relies on the availability of a great amount of training data and sufficient com-
putation. In practical recommendation applications, data collection mainly originates from users’
interactions observed during their visits to online platforms. There exist recommendation scenarios
where available user interaction data is sparse (e.g. cold-start recommendation) and computation
for model training is restrained (e.g. online recommendation). Consequently, both data insufficiency
and computation inefficiency issues bottleneck deep learning based recommendation models.
Recently, meta-learning has emerged as an appealing learning paradigm that focuses on strengthen-
ing the generalization ability of machine learning methods against the insufficiency of data and
computation [36, 98]. The key idea of meta-learning is to gain prior knowledge (named meta-
knowledge) about efficient task learning from previous learning processes of multiple tasks. Then,
the meta-knowledge could help facilitate fast learning over new tasks and is expected to yield
good generalization performance on unseen tasks. Here, a task usually refers to a set of instances
belonging to the same class or having the same property, involving an individual learning process
on it. Different from improving the representation learning capacity of deep learning models,
meta-learning focuses on learning better learning strategies to substitute for fixed learning algo-
rithms, a concept known as learning to learn. Due to its great potential for fast adaptation over
unseen tasks, meta-learning techniques have been applied in a wide range of research domains
including image recognition [4, 130], image segmentation [60], natural language processing [48],
reinforcement learning [75, 103] and so on.
The benefits of meta-learning are well aligned with the need to promote recommendation
models in scenarios suffering from limited instances and inefficient computation. Early efforts
on meta-learning based recommendation methods mainly fall into personalized recommendation
algorithm selection [13, 78], which extracts meta dataset features and selects suitable recom-
mendation algorithms for different datasets (or tasks). Though applying the idea of extracting
meta-knowledge and generating task-specific models, this definition of meta-learning is closer
to studies in automated machine learning [39, 115]. Afterward, deep meta-learning [38] or neu-
ral network meta-learning [36] emerged and gradually became the mainstream of meta-learning
techniques typically discussed in recommendation models [47, 69]. As introduced in [36, 38],
Deep Meta-Learning aims to extract meta-knowledge to allow for fast learning of deep neural
networks, which brings enhancement to the currently popular deep learning paradigm. Since
2017, deep meta-learning has gained attention in the research community of recommendation
systems. Advanced meta-learning techniques are firstly applied to alleviate data insufficiency (i.e.,
cold-start issue) when training conventional deep recommendation models. For example, the most
successful optimization-based meta-learning framework MAML which learns meta-knowledge
in the form of parameter initialization of neural networks firstly shows great effectiveness in the
cold-start recommendation scenario [47]. Besides that, diverse recommendation scenarios such as
click-through-rate prediction [69], online recommendation [123], and sequential recommendation
[125] are also studied under the meta-learning schema, to improve the learning ability in the setting
of data insufficiency and computation inefficiency.
In this paper, we provide a timely and comprehensive survey of the rapidly growing studies
of deep meta-learning based recommendation systems. Although there have
been some surveys on meta-learning or deep meta-learning that summarize details of general meta-
learning methods and their applications [36, 38, 98], there is still a lack of attention to recent advances
in recommendation systems. In addition, there are several surveys on meta-learning methods in
other application domains, such as Natural Language Processing [48, 117], Multimodality [61]
and Image Segmentation [60]. However, no previous survey centers on the deep meta-learning
in recommendation systems. Compared with them, our survey is the first attempt to fill the gap,
providing a systematic review of up-to-date papers on the combination of meta-learning and
recommendation systems.
In our survey, we aim to thoroughly review the literature on deep meta-learning based recom-
mendation systems, which can benefit readers and researchers for a comprehensive understanding
of this topic. To carefully position works in this field, we provide a taxonomy with three perspectives
including recommendation scenarios, meta-learning techniques, and meta-knowledge representa-
tions. Moreover, we mainly discuss related methods according to recommendation scenarios and
present how different works utilize meta-learning techniques to extract specific meta-knowledge
with diverse forms such as parameter initialization, parameter modulation, hyperparameter op-
timization, etc. We hope our taxonomy could provide a design space for developing new deep
meta-learning based recommendation methods. In addition, we also summarize common ways for
meta-learning task construction, which is a necessary setup of the meta-learning paradigm.
The structure of this survey is organized as follows. In Section 2, we introduce the common
foundations of meta-learning techniques and typical recommendation scenarios in which meta-
learning methods have been studied to alleviate data insufficiency and computation inefficiency. In
Section 3, we present our taxonomy consisting of three independent axes. In Section 4, we summarize
different ways of meta-learning recommendation task construction used in the literature. Then
we elaborate on methodological details of existing methods applying meta-learning techniques
in different recommendation scenarios in Section 5. Finally, we discuss promising directions for
future research in this field in Section 6 and conclude this survey in Section 7.
Paper Collection. We summarize over 50 high-quality papers which are highly related to
deep meta-learning based recommendation systems. We carefully retrieve these papers using
Google Scholar and DBLP as main search engines with major keywords including meta-learning +
recommendation, meta + recommendation, meta + CTR, meta + recommender, etc. We particularly
pay attention to top-tier conferences and journals including KDD, SIGIR, WWW, AAAI, IJCAI,
WSDM, CIKM, ICDM, TKDE, TKDD, TOIS, so as to ensure that high-profile papers are covered.
2 FOUNDATIONS
In this section, we present the necessary foundations for discussing deep meta-learning based
recommendation methods. Firstly, we summarize the core ideas and representative works of
different categories of meta-learning techniques. Afterward, we introduce typical recommendation
scenarios in which meta-learning techniques have been studied and applied.
2.1 Meta-learning
To comprehensively understand the concept of meta-learning, we first formalize the paradigm of
meta-learning and contrast the conventional machine learning paradigm with the meta-learning
paradigm in detail. Then, we briefly present three mainstreams of meta-learning techniques, includ-
ing metric-based, model-based and optimization-based meta-learning techniques by summarizing
their core ideas and introducing several typical related works. For convenience, we list some general
symbols and their descriptions in Table 1.
Table 1. General notations and their descriptions.

Notation | Description
$u_i$ | User $i$
$v_j$ | Item $j$
$r_{u_i, v_j}$ | An interaction between $u_i$ and $v_j$ (explicit rating or implicit feedback)
$\boldsymbol{x}_k, y_k$ | Representation and label of the $k$-th instance (e.g., an interaction)
$\mathcal{T}_i$ | The $i$-th recommendation task
$\mathcal{S}_i$ | Support set of a task $\mathcal{T}_i$
$\mathcal{Q}_i$ | Query set of a task $\mathcal{T}_i$
$\mathcal{D}^{train}$ | Meta-training dataset
$\mathcal{D}^{test}$ | Meta-testing dataset
$f_\theta$ | Base recommendation model/function
$\theta$ | Parameters of the base recommendation model
$\theta_{\mathcal{T}_i}$ | Task-specific parameters of a personalized model for $\mathcal{T}_i$
$\alpha$ | Local update rate in optimization-based meta-learning
$\beta$ | Global update rate in optimization-based meta-learning
$\mathcal{L}(f_\theta, *)$ | Loss function of the base recommendation model over a given dataset
$\mathcal{F}_\omega$ | Meta-learner parameterized with $\omega$
$\omega$ | Meta-knowledge obtained with the meta-learner
After task-specific learning, $f_{\theta_{\mathcal{T}_i}}$ is supposed to perform well, as measured by the empirical loss $\mathcal{L}(f_{\theta_{\mathcal{T}_i}}, \mathcal{Q}_i)$ or other evaluation
metrics in different settings. Notably, learning tasks under other schemes such as reinforcement
learning [1] and unsupervised learning [63] have also been studied.
In the training processes of different meta-training tasks in $\mathcal{D}^{train}$, even if the form of the
mapping functions could be the same, how each task-specific model should be learned is still distinct and guided
by learnable settings about task learning. For example, approximating mapping functions with neural networks of
the same structure requires suitable hyperparameters or initialization settings, which are likely to
differ across tasks. In other words, the learning of each task $\mathcal{T}_i$ also depends on how to
learn, which is defined as the meta-knowledge $\omega$ under the meta-learning paradigm. Therefore, the
task-specific learning of $\mathcal{T}_i$ could be formalized as follows:

$$\theta_{\mathcal{T}_i} = h_\omega(f_\theta, \mathcal{T}_i, \mathcal{L}) \qquad (2)$$

where $h_\omega(*)$ denotes the meta-learning approach of utilizing meta-knowledge to ensure effective
learning of task $\mathcal{T}_i$ with the same mapping function $f_\theta$ and loss function $\mathcal{L}$.
Instead of assuming the meta-knowledge 𝜔 is pre-defined and fixed for all tasks, meta-learning
allows for learning 𝜔 to enable each task to be learned better. Manually searching in the whole
meta-knowledge space is impractical in most cases. The goal of meta-learning is to learn the optimal
𝜔 which could be utilized to guide task-specific learning of all tasks to perform better. Formally,
given all training tasks $\mathcal{D}^{train} = \{\mathcal{T}_i\}_{i=1}^{M}$, the optimal meta-knowledge $\omega^*$ is obtained as follows:

$$\omega^* = \arg\min_{\omega} \sum_{\mathcal{T}_i \in \mathcal{D}^{train}} \mathcal{L}(f_{\theta_{\mathcal{T}_i}}, \mathcal{Q}_i) = \arg\min_{\omega} \sum_{\mathcal{T}_i \in \mathcal{D}^{train}} \mathcal{L}(f_{h_\omega(f_\theta, \mathcal{T}_i, \mathcal{L})}, \mathcal{Q}_i) \qquad (3)$$
where the objective of training meta-learning methods is to observe better performance (e.g., lower
empirical loss) over the corresponding query set $\mathcal{Q}_i$ of each task. Note that the meta-knowledge is
learned across multiple tasks, since it is supposed to mine cross-task characteristics of different
task learning processes and to generalize well despite task differences.
In contrast with conventional machine learning, (e.g., regular supervised learning paradigm), the
meta-learning paradigm mainly has the following properties: 1) Learning objective. The learning
objective of meta-learning, i.e., the meta-optimization objective, is to facilitate the learning over unseen
tasks, while conventional machine learning aims to facilitate the learning over unseen instances of
the same task. 2) Setup of task division. For regular supervised machine learning, all instances
are usually sampled from the data distribution of a single task. There are also multi-task learning
[6] or transfer learning frameworks [109] which consider knowledge transfer across multiple tasks.
However, these frameworks mainly consider a pair of tasks or a small number of known tasks, and
transfer knowledge from other tasks as additional information, such as pretraining techniques or
joint optimization strategies. In comparison, under the meta-learning paradigm, a larger number of
tasks with relatively fewer instances are explicitly split according to specific properties (e.g., classes,
attributes, or time), so as to extract prior knowledge about task learning at a higher level, i.e.,
learn to learn. 3) Learning framework. A common framework of meta-learning follows a bi-level
learning structure consistent with meta-optimization objectives. The inner-level learning focuses
on task-specific learning to generate training instances of the outer-level learning. The outer-level
learning is responsible for learning the meta-knowledge across multiple instances. For most regular
machine learning, only one level of learning is conducted over all supervised instances through
batch learning, which is the same as the inner-level learning in the meta-learning paradigm.
2.1.2 Mainstream Frameworks of Meta-learning Techniques. As summarized by previous meta-
learning surveys [38, 98], meta-learning techniques mainly fall into three categories, namely metric-
based, model-based, and optimization-based meta-learning methods. Next, we will elaborate on the
formalization, technical details, and representative works of each category and discuss their pros
and cons compared with each other.
Metric-based Meta-learning resorts to the idea of metric learning and mainly represents meta-
knowledge 𝜔 in the form of a meta-learned feature space where the similarity of support instances
and query instances is compared. Specifically, task-specific learning in metric-based techniques is
conducted in the form of non-parametric learning. In other words, in the inner-level learning of
each task, the parameters of the mapping function 𝑓𝜃 are not optimized to fit the training instances
S𝑖 but directly utilized to generate labels of evaluation instances Q𝑖 . For the mapping function
𝑓𝜃 , metric-based methods mainly rely on a similarity scoring function 𝑠𝑖𝑚(𝒙𝑖 , 𝒙 𝑗 ) which takes
embeddings of two instances (e.g., a training (support) instance and an evaluation (query) instance)
as inputs and calculates a similarity weight in the meta-learned feature space. Then the label of an
evaluation instance is assigned by the weighted combination of labels from all training (support)
instances. Formally, the predicted label vector 𝒚^𝑖 of a query instance 𝒙𝑖 in the task T𝑖 could be
obtained as follows:

$$\hat{\boldsymbol{y}}_i = \sum_{(\boldsymbol{x}_j, \boldsymbol{y}_j) \in \mathcal{S}_i} sim(\boldsymbol{x}_i, \boldsymbol{x}_j)\,\boldsymbol{y}_j \qquad (4)$$
Note that we present only a basic form of metric-based meta-learning. In the literature, the
similarity function $sim(\boldsymbol{x}_i, \boldsymbol{x}_j)$ and label generation could be achieved in different forms such as
siamese nets [45], matching nets [100], prototypical nets [88], relation nets [95], and graph neural
networks [84].
For outer-level learning, metric-based meta-learning aims to learn the feature space for effectively
comparing instance similarity in new tasks. Therefore, the meta-knowledge 𝜔 coincides with
the parameters $\theta$ in the mapping function of the inner-level learning. Then $\theta$ is optimized by
minimizing the empirical loss over the query sets of multiple training tasks as in equation (3). Note
that $\theta_{\mathcal{T}_i}$ is identical to $\theta$, since the inner-level task-specific learning is non-parametric.
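To make the non-parametric inner loop concrete, the following is a minimal sketch of equation (4) in the style of matching networks, assuming a softmax over cosine similarities as the normalization of the similarity weights (equation (4) leaves the normalization unspecified); the linear encoder is a hypothetical stand-in for the meta-learned mapping $f_\theta$, whose parameters the outer-level learning would train by backpropagating query losses across tasks.

```python
import torch
import torch.nn.functional as F

def metric_based_predict(encoder, support_x, support_y, query_x):
    """support_x: (S, d_in); support_y: (S, C) one-hot; query_x: (Q, d_in)."""
    s_emb = F.normalize(encoder(support_x), dim=-1)   # embed support set in the meta-learned space
    q_emb = F.normalize(encoder(query_x), dim=-1)     # embed query instances
    sim = q_emb @ s_emb.t()                           # cosine similarities, shape (Q, S)
    weights = sim.softmax(dim=-1)                     # normalize weights over the support set
    return weights @ support_y                        # weighted combination of support labels

# Toy usage: a linear encoder over 8-dim instance features, 2 classes.
encoder = torch.nn.Linear(8, 16)
support_x, query_x = torch.randn(5, 8), torch.randn(3, 8)
support_y = F.one_hot(torch.randint(0, 2, (5,)), num_classes=2).float()
y_hat = metric_based_predict(encoder, support_x, support_y, query_x)  # (3, 2)
```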
Model-based Meta-learning is another widely used meta-learning technique with the help
of the powerful representation ability of neural network structures. The key idea of model-based
methods is to meta-learn a model or a module to encode the internal states of a task by observing its
support instances. Conditioned on the internal states, the model-based meta-learner could capture
task-specific information and guide task-adaptive predictions for evaluation instances.
In the model-based meta-learning, inner-level learning mainly focuses on encoding the support
instances (or gradients) of the task into representations of the internal state with a neural network
structured model such as feed-forward networks, recurrent neural networks [35, 77], convolutional
neural networks [64] or hypernetworks [27, 74]. The predictions of query instances are usually
obtained with a modulated predictor conditioned on the encoded task-specific state representation.
Formally, the prediction of a query instance 𝒙𝑖 in the task T𝑖 could be obtained as follows:
$$\hat{\boldsymbol{y}}_i = f_{g_\omega(\theta, \mathcal{D}_i)}(\boldsymbol{x}_i) \qquad (5)$$

where the meta-knowledge $\omega$ plays the role of mapping task-specific states to modulation signals for
predictors or optimization strategies. In general, $\omega$ is represented in the form of an external meta
model $g$. The meta model $g$ could be instantiated with neural networks [104] or external memories
[83]. For the outer-level learning, the optimization of the meta-learner is usually coupled with the
training of the inner-level mapping function, since the outputs of the inner-level learning rely on
the outputs of the meta-learner.
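As a minimal sketch of equation (5), the snippet below assumes one common instantiation, not a specific published design: a permutation-invariant set encoder pools the support pairs into a task state, and a meta model $g_\omega$ maps that state to FiLM-style scale and shift signals that modulate a base predictor for query instances. All module names and dimensions are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ModelBasedMetaLearner(nn.Module):
    def __init__(self, d_in, d_hid):
        super().__init__()
        self.base = nn.Linear(d_in, d_hid)          # body of the base predictor f_theta
        self.head = nn.Linear(d_hid, 1)
        self.task_enc = nn.Linear(d_in + 1, d_hid)  # encodes (x, y) support pairs
        self.meta = nn.Linear(d_hid, 2 * d_hid)     # g_omega: task state -> (scale, shift)

    def forward(self, support_x, support_y, query_x):
        pairs = torch.cat([support_x, support_y], dim=-1)
        state = self.task_enc(pairs).mean(dim=0)      # permutation-invariant task state
        scale, shift = self.meta(state).chunk(2, dim=-1)
        h = self.base(query_x) * (1 + scale) + shift  # task-conditioned modulation
        return self.head(torch.relu(h)).squeeze(-1)

model = ModelBasedMetaLearner(d_in=8, d_hid=16)
pred = model(torch.randn(5, 8), torch.rand(5, 1), torch.randn(3, 8))  # (3,)
```

Training such a model exemplifies the coupling noted above: a single outer loop optimizes the base predictor and the meta model jointly on query losses across tasks, with no task-specific gradient steps.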
Optimization-based Meta-learning strictly follows a bi-level optimization structure and sep-
arates the inner-level learning and outer-level learning via different gradient descent steps. We
take the well-known model-agnostic meta-learning (MAML) framework as an example; many
studies extend the MAML framework. Specifically, in the inner-level learning, a base model
performs as the predictor and conducts a few steps of local optimization based on the empirical
loss over support instances as follows:

$$\theta_{\mathcal{T}_i} = \theta - \alpha \nabla_\theta \mathcal{L}(f_\theta, \mathcal{S}_i) \qquad (6)$$
where 𝜃 is the initialization of the base model parameters. We simply show one step of gradient
descent. By performing the local update of the base model, 𝜃 T𝑖 is utilized as the learned model
after task-specific learning of the task T𝑖 . Here, task-specific learning refers to regular gradient
descent based optimization, which is also the reason why this category is called optimization-based
meta-learning.
The meta-knowledge $\omega$ is represented in the form of parameter initialization in MAML, i.e., $\theta$.
Other types of meta-knowledge representation have also been studied. The $\theta$ is assigned
to each task as the meta-learned global initialization before task-specific learning. Therefore, in the
outer-level learning, $\theta$ is optimized by minimizing the evaluation loss across different tasks to
ensure that the initialization has generalization capacity as the meta-knowledge. Formally, the
outer-level optimization, i.e., meta-optimization, is conducted as follows:
$$\theta \leftarrow \theta - \beta \nabla_\theta \sum_{\mathcal{T}_i \in \mathcal{D}^{train}} \mathcal{L}(f_{\theta_{\mathcal{T}_i}}, \mathcal{Q}_i) \qquad (7)$$

where the global initialization $\theta$ is updated across all tasks in the meta-training dataset $\mathcal{D}^{train}$ with
second-order gradients, since $\theta_{\mathcal{T}_i}$ is obtained through gradient descent as in equation (6).
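The sketch below illustrates equations (6) and (7) for a toy linear base model, with one inner gradient step per task; the random regression task sampler is a hypothetical stand-in for meta-training tasks. Passing create_graph=True keeps the local update differentiable so that the global update carries the second-order gradients mentioned above.

```python
import torch

theta = {"w": torch.randn(8, 1, requires_grad=True),   # global initialization
         "b": torch.zeros(1, requires_grad=True)}
alpha, beta = 0.01, 0.001                              # local / global update rates
meta_opt = torch.optim.SGD(theta.values(), lr=beta)

def loss(params, x, y):
    return ((x @ params["w"] + params["b"] - y) ** 2).mean()

def sample_tasks(n_tasks=4, s=5, q=5):
    # Hypothetical sampler: each task is a random linear regression problem.
    for _ in range(n_tasks):
        w = torch.randn(8, 1)
        xs, xq = torch.randn(s, 8), torch.randn(q, 8)
        yield (xs, xs @ w), (xq, xq @ w)

for step in range(100):
    meta_opt.zero_grad()
    for (xs, ys), (xq, yq) in sample_tasks():
        grads = torch.autograd.grad(loss(theta, xs, ys),
                                    list(theta.values()), create_graph=True)
        theta_i = {k: p - alpha * g                    # equation (6): local update
                   for (k, p), g in zip(theta.items(), grads)}
        loss(theta_i, xq, yq).backward()               # accumulates equation (7) gradients
    meta_opt.step()                                    # global update of the initialization
```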
Discussion: Pros and Cons. The three frameworks of meta-learning techniques discussed
above roughly cover most of the existing meta-learning methods. We summarize their advantages
and disadvantages in terms of computation efficiency, sensitivity to task distribution, and
applicability. First, metric-based meta-learning has a small computational burden since simple
similarity calculation requires no additional task-specific model update over new tasks. However,
when task distribution is complex, metric-based methods usually perform unstably in the meta-test
phase since no task information is absorbed to cope with task differences. Second, model-based
meta-learning has relatively simple optimization steps compared with optimization-based meta-
learning which requires second-order gradients. In addition, developed with diverse neural network
structures, model-based methods usually have broader applicability compared to the other two.
However, this category is criticized for performing worse on out-of-distribution tasks, i.e., it is sensitive
to task distribution. Third, the key advantage of optimization-based meta-learning is that it is
usually agnostic to the base model structure and can be compatible with diverse base models. In
practice, optimization-based meta-learning shows better generalization ability when the task distribution
is complex. However, this category of methods mainly suffers from heavy computation due to two
levels of gradient descent.
networks [49] and so on. By doing this, representations of users and items are enhanced with
additional semantic information so that the demand for interaction data is weakened to some extent.
Besides that, cold-start recommendation can be treated as an application of few-shot learning
where only a small number of samples are observed in each task. Similarly, recommendation tasks
for new users or items with sparse interactions are naturally divided into meta-training tasks,
and meta-learning techniques are widely utilized to alleviate the data insufficiency of cold-start
recommendation tasks [15, 47].
Click-through-rate prediction. In online advertising applications, click-through rate (CTR) is
a key index for determining the value of published ads [9, 87, 126]. A rational ad auction mechanism
should spend more on ads with higher CTRs, so as to ensure greater returns. Therefore, accurate
CTR prediction provided by advertisement publishers could assist investors with subsequent
resource allocations. To estimate the click probability of a user-ad pair, recent CTR prediction
models usually follow a general framework consisting of two parts including an embedding layer
and a prediction layer [9]. Specifically, the embedding layer first learns latent embedding vectors
for both ad/user ids and other rich features. Then the prediction layer is utilized to model feature
interaction or dependencies with sophisticated models which are usually well designed as deep
neural structures. Despite its success in both academia and industry, the majority of these methods
work poorly on new ads due to the lack of embedding learning [69]. Known as the cold-start
problem in CTR prediction, embeddings (especially identity embeddings) of new ads, which have
limited click records, are hard to train as well as those of existing ads. As we investigated,
meta-learning methods have been studied to strengthen embedding learning for cold-start ads.
Online recommendation. In practical large-scale recommender systems, real-time user in-
teraction data are generated and collected continuously. It is necessary to refresh previously
learned recommendation models in a timely manner so that dynamic user preferences and trends can
be captured [28, 33]. Instead of training a model offline purely based on historical logs, online
recommendation attempts to continuously update current recommendation models based on newly
arrived data in an online fashion. Online learning strategies and model retraining mechanisms are
explored in this field to meet the needs. Due to practical requirements in real-world applications,
computation efficiency is a critical factor that should be emphasized. For instance, full retrain-
ing over both historical and new samples is an ideal strategy for model refreshing but is
impractical due to its unacceptable time cost [123]. Therefore, to improve the ability of fast learning,
meta-learning has been introduced into online recommendation scenarios and used to quickly
capture dynamic preference trends from real-time user interaction data [71, 123].
Point-of-interest recommendation. With the emergence of location-based social networks
(LBSNs), users are willing to share their visited points-of-interest (POIs) through check-in records.
LBSN services are supposed to provide personalized recommendations on other POIs that users
have not visited. Compared with general item (e.g., product, music, and movie) recommendation,
POI recommendation relies more on discovering spatial-temporal dependencies from historical
check-in data. This phenomenon is also very intuitive since users’ activities are largely influenced
by geospatial and temporal constraints. By incorporating geographical and time information
of check-in data, a series of approaches involving spatio-temporal modeling are proposed for
POI recommendation [92, 124]. Despite their success, the data sparsity issue is pronounced in this
recommendation scenario, since users must physically arrive at the locations of the POIs they share. In other words, it
is common that users have visited only a small number of POIs because of the high cost of data generation.
Therefore, meta-learning based POI recommendation methods have been studied to combat severe
data sparsity [11, 91].
Sequential recommendation. The heart of sequential recommendation is to capture evolving
user preferences from users’ interaction sequences. Different from traditional collaborative filtering
methods which organize interactions in the form of user-item pairs, sequential recommendation
methods mainly utilize the sequence of previously interacted items of a user as input, and make
efforts to discover sequential patterns of user interest evolution. Specifically, representative sequen-
tial modeling methods including Markov Chains [31, 79], recurrent neural networks [34, 50], and
self-attention based networks [41, 112], have achieved promising performance in modeling both
short-term and long-term interests based on interaction sequences. However, the performances
of sequential recommenders usually rely on sufficient items in the sequences. When the number
of historical interactions is relatively small, model performance tends to degrade significantly
and fluctuate greatly. Consequently, the data sparsity issue also brings stubborn obstacles in the
sequential recommendation scenario.
3 TAXONOMY
In this section, we establish our taxonomy of deep meta-learning based recommendation systems
and summarize the characteristics of existing methods according to the taxonomy.
In general, we define our taxonomy in terms of three independent axes, including recommen-
dation scenarios, meta-learning techniques, and meta-knowledge representation. Fig.1 shows the
taxonomy. The previous taxonomies of general meta-learning methods proposed in [38, 98] care
more about the three categories of meta-learning frameworks introduced in Section 2.1 but pay
limited attention to practical applications of meta-learning techniques. In addition, [36] proposes a
new taxonomy involving three perspectives: meta-representation, meta-optimizer, and
meta-objective. It provides a more comprehensive breakdown that can orient the development of
new meta-learning methods. However, it focuses on the whole meta-learning landscape and is not
well suited to reflecting the current research status and application scenarios in deep meta-learning based
recommendation systems. Therefore, we concentrate on the recommendation system community
and summarize the characteristics of existing works following three dimensions:
Recommendation scenarios (Where): This axis presents the specific scenario where the meta-
learning based recommendation methods are proposed and applied. As introduced in section
2.2, we summarize typical recommendation scenarios into the following groups: 1) cold-start
recommendation, 2) click-through-rate prediction, 3) online recommendation, 4) point-of-interest
recommendation, 5) sequential recommendation, and 6) others. For clarity, we do not display all
involved recommendation scenarios one by one but group less-studied scenarios together and
denote them as others.
Meta-learning techniques (How): This axis presents how meta-learning is applied to
enhance generalization ability over new recommendation tasks. Following the taxonomy in [38, 98],
we also divide meta-learning techniques into three categories including metric-based meta-learning,
model-based meta-learning, and optimization-based meta-learning.
Meta-knowledge representations (What): This axis presents the form of meta-knowledge to
be represented so that it could be beneficial for improving the fast learning of recommendation
models. After distilling from existing works, we summarize common representations of meta-
knowledge as parameter initialization, parameter modulation, hyperparameters, sample weights,
embedding space, and meta model. Generally speaking, different meta-learning techniques have
distinct characteristics of meta-knowledge representation. For example, parameter initialization is
usually achieved under the optimization-based meta-learning while parameter modulation is more
likely to belong to model-based meta-learning. However, there are also situations where multiple
types of meta-knowledge representations are learned simultaneously in a hybrid manner.
By investigating existing works from the three independent dimensions above, our taxonomy is
expected to be able to provide a clear design space for deep meta-learning based recommendation
methods. We organize papers according to recommendation scenarios and present the characteristics of
these works along the taxonomy in Tables 2 and 3, which list detailed publication information
and highlight the major meta-learning techniques and the forms of meta-knowledge representations.
Table 2. Summary of meta-learning based recommendation methods (part 1). We organize all these methods from hierarchical perspectives of scenarios and meta-learning techniques. We use the following abbreviations. Optimi.: Optimization-based. Model: Model-based. Para. Init.: Parameter Initialization. Para. Modu.: Parameter Modulation. Hyperpara.: Hyperparameter. Embed. Space: Embedding Space.

Columns: Method | Venue | Year | Meta-learning Technique (Optimi. / Model / Metric) | Meta-knowledge Representation (Para. Init. / Para. Modu. / Hyperpara. / Meta-Model / Embed. Space / Sample Weight)

Cold-start Recommendation:
LWA [99] NIPS 2017 ✓ ✓
MeLU [47] KDD 2019 ✓ ✓
MetaCS [2] IJCNN 2019 ✓ ✓ ✓
MetaHIN [58] KDD 2020 ✓ ✓ ✓
MAMO [15] KDD 2020 ✓ ✓ ✓
MetaCF [107] ICDM 2020 ✓ ✓ ✓
TaNP [53] WWW 2021 ✓ ✓ ✓
PALRML [119] AAAI 2021 ✓ ✓ ✓
MIRec [122] WWW 2021 ✓ ✓ ✓
MPML [8] ECIR 2021 ✓ ✓
PAML [105] IJCAI 2021 ✓ ✓ ✓
CMML [21] CIKM 2021 ✓ ✓ ✓
Heater [134] SIGIR 2021 ✓ ✓ ✓
PreTraining [30] SIGIR 2021 ✓ ✓
ProtoCF [82] RecSys 2021 ✓ ✓
MetaEDL [67] ICDM 2021 ✓ ✓
DML [66] AAAI 2022 ✓ ✓
PNMTA [70] WWW 2022 ✓ ✓ ✓

Click-Through-Rate Prediction:
Meta-Embed. [69] SIGIR 2019 ✓ ✓ ✓
TDAML [5] ACMMM 2020 ✓ ✓ ✓ ✓
MWUF [133] SIGIR 2021 ✓ ✓ ✓
DisNet [51] Complexity 2021 ✓ ✓ ✓
GME [68] SIGIR 2021 ✓ ✓ ✓
Meta-SSIN [94] SIGIR (short) 2021 ✓ ✓

Point-of-Interest Recommendation:
PREMERE [43] AAAI 2021 ✓ ✓ ✓
MetaODE [97] MDM 2021 ✓ ✓
MFNP [91] IJCAI 2021 ✓ ✓
CHAML [7] KDD 2021 ✓ ✓ ✓
Meta-SKR [11] TOIS 2022 ✓ ✓ ✓
Table 3. Summary of meta-learning based recommendation methods (part 2). We organize all these methods from hierarchical perspectives of scenarios and meta-learning techniques. We use the following abbreviations. Optimi.: Optimization-based. Model: Model-based. Para. Init.: Parameter Initialization. Para. Modu.: Parameter Modulation. Hyperpara.: Hyperparameter. Embed. Space: Embedding Space.

Columns: Method | Venue | Year | Meta-learning Technique (Optimi. / Model / Metric) | Meta-knowledge Representation (Para. Init. / Para. Modu. / Hyperpara. / Meta-Model / Embed. Space / Sample Weight)

Online Recommendation:
S2Meta [17] KDD 2019 ✓ ✓ ✓ ✓
SML [123] SIGIR 2020 ✓ ✓ ✓
FLIP [57] IJCAI 2020 ✓ ✓
FORM [93] SIGIR 2021 ✓ ✓ ✓
LSTTM [111] WSDM 2022 ✓ ✓
ASMG [71] RecSys 2021 ✓ ✓ ✓
MeLON [44] AAAI 2022 ✓ ✓ ✓

Sequential Recommendation:
Mecos [125] AAAI 2021 ✓ ✓
MetaTL [102] SIGIR (short) 2021 ✓ ✓
CBML [89] CIKM 2021 ✓ ✓ ✓
metaCSR [37] TOIS 2022 ✓ ✓

Cross-Domain Recommendation:
TMCDR [129] SIGIR (short) 2021 ✓ ✓
PTUPCDR [132] WSDM 2022 ✓ ✓ ✓

Multi-behavior Recommendation:
CML [108] WSDM 2022 ✓ ✓ ✓
MB-GMN [110] SIGIR 2021 ✓ ✓ ✓

Others:
MetaKG [16] TKDE 2022 ✓ ✓
MetaSelector [59] WWW 2020 ✓ ✓ ✓
Meta-SF [46] SDM 2019 ✓ ✓
MetaMF [54] SIGIR 2020 ✓ ✓ ✓
MetaHeac [131] KDD 2021 ✓ ✓
NICF [135] SIGIR 2021 ✓ ✓
different domains), while tasks in the cross-lingual setting are divided based on different languages.
Overall, the settings of meta-learning tasks in the other fields mentioned above are closely related
to the task objectives and data characteristics. Therefore, we specifically discuss the construction of
meta-learning recommendation tasks and present how existing meta-learning methods perform
task division with interaction data from recommendation systems.
According to the common properties of the interactions in a task, we mainly summarize the
task construction approaches into four categories: user-specific, item-specific, time-specific,
and sequence-specific tasks. A few works have tried other ways, but their number is relatively small;
we group them all into the category named others. Table 4
summarizes the works adopting each category of task construction.
User-specific Task. As observed in Table 4, the most typical way of task construction is based
on users. Since the user cold-start issue is the most long-standing problem in recommendation
systems, quickly learning preferences from users’ limited interactions is a critical task to be solved.
In the setting of user-specific task T𝑖 , all instances of a task including both the support set S𝑖
and the query set $\mathcal{Q}_i$ belong to the same user. Learning the preferences of different users is
naturally treated as different tasks. Consider the illustrative example shown in Fig. 2(a). For a user-specific
task of a specific user $u_1$, all his interactions are split into a support set $\mathcal{S}_1 = \{(v_j, r_{u_1,v_j})\}_{j=1}^{3}$ and
a query set $\mathcal{Q}_1 = \{(v_j, r_{u_1,v_j})\}_{j=4}^{5}$, where $r_{u_1,v_j}$ could be an explicit rating score or implicit feedback
between user $u_1$ and item $v_j$. The goal of each user-specific task is to train a model on the support
set and evaluate it on the interactions in the query set of the same user. From the perspective of
the meta-optimization objective, meta-learning methods are expected to extract meta-knowledge
about user preference learning from a sufficient number of user-specific tasks D 𝑡𝑟𝑎𝑖𝑛 . Then when
faced with unseen user-specific tasks from new users, the meta-knowledge should work as prior
experiences to facilitate preference learning.
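A minimal sketch of this construction is shown below: interactions are grouped per user, and each user's records are split into a support set and a query set, mirroring the example of user $u_1$ above. The field names and split size are illustrative assumptions.

```python
from collections import defaultdict

def build_user_tasks(interactions, n_support=3):
    """interactions: list of (user_id, item_id, rating) triples."""
    by_user = defaultdict(list)
    for user, item, rating in interactions:
        by_user[user].append((item, rating))
    tasks = {}
    for user, records in by_user.items():
        if len(records) <= n_support:       # skip users with too few interactions
            continue
        tasks[user] = {"support": records[:n_support],   # S_i
                       "query": records[n_support:]}     # Q_i
    return tasks

logs = [("u1", "v1", 5), ("u1", "v2", 3), ("u1", "v3", 4),
        ("u1", "v4", 2), ("u1", "v5", 5), ("u2", "v1", 1)]
tasks = build_user_tasks(logs)   # one meta-learning task per (sufficiently active) user
```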
Item-specific Task. Symmetric with the user-specific task, an item-specific task is constructed
based on all instances involving the same item. From the view of an item, interaction instances are
grouped based on different items. As illustrated in Fig 2 (b), three item-specific tasks are constructed
according to three different items including a shirt, a shoe, and a phone. Similar to user-specific
tasks, meta-learning based item-specific tasks usually aim at tackling the item cold-start problem.
In this setting, the support set and the query set of a task cover all interactions between multiple
users and the same item. The goal of each item-specific task is to predict the ratings or interaction
probabilities of evaluation instances in the query set after observing the support set. By extracting
Fig. 2. Illustration of task construction for (a) user-specific tasks and (b) item-specific tasks.
meta-knowledge across multiple item-specific tasks, meta-learning methods could quickly perceive
the overall preference for cold-start items, making accurate predictions and recommendations.
Time-specific Task. In this setting, interaction data in recommendation systems are split into
different tasks according to different time slots. Specifically, interaction data are considered as
collected continually, arriving in the form of a data stream. Formally, the data block collected at time $t$
is denoted as $I_t = \{(u_i, v_j, r_{u_i,v_j})\}$ with $M$ interactions. Different from user-specific or item-specific settings,
interactions in time-specific tasks are no longer distinguished by users or items. As shown in Fig.
3(a), time-specific tasks are sequentially constructed with data from two successive time slots. For
instance, for the task at time 2, the support set consists of the data block $I_2$, i.e., data collected at the
current time slot. For the query set, the data block $I_3$ from the next time slot is utilized as evaluation data.
The reason for this setting is that the goal of a time-specific task is usually to efficiently update
models in an online setting so that the updated model could still perform well in the next period.
Meta-learning can also be used to facilitate the efficiency of model online updates by gradually
extracting meta-knowledge from sequential time-specific tasks.
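A minimal sketch of this construction, under assumed field names and slot length: the interaction stream is bucketed into time slots, and the task at slot $t$ takes block $I_t$ as its support set and the next block $I_{t+1}$ as its query set.

```python
def build_time_tasks(stream, slot_len=3600):
    """stream: time-ordered list of (timestamp, user_id, item_id, feedback)."""
    blocks = {}
    for ts, user, item, fb in stream:
        blocks.setdefault(ts // slot_len, []).append((user, item, fb))
    slots = sorted(blocks)
    # Each pair of successive slots yields one time-specific task:
    # support = data of the current slot, query = data of the next slot.
    return [{"support": blocks[t0], "query": blocks[t1]}
            for t0, t1 in zip(slots, slots[1:])]
```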
Sequence-specific Task. As illustrated in Fig 3 (b), sequence-specific tasks are also constructed
with temporal information considered. Different from time-specific tasks which collect data at the
system level, the sequence-specific setting treats interaction sequences of different users or different
sessions as different tasks. For example, the whole interaction sequence of user $u_1$ is denoted as
$\{(v_1, r_{u_1,v_1}), (v_2, r_{u_1,v_2}), \dots, (v_t, r_{u_1,v_t})\}$, ordered by interaction timestamps. For constructing a
sequence-specific task, the interaction sequence of length $t$ is usually split into two parts.
The first $K$ interactions are allocated to the support set, while the remaining $t - K$ interactions
are allocated to the query set. There are two major differences between user-specific tasks and
sequence-specific tasks. First, sequence-specific tasks are not restricted to the interaction histories of
identified users; anonymous sessions can also serve as independent interaction sequences. Second, the
instances in sequence-specific tasks are usually subsequences of the whole interaction
sequence, while the instances in user-specific tasks are interaction pairs.
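A minimal sketch of this split is given below; representing each instance as a (prefix subsequence, next item) pair is one common choice consistent with the remark that instances here are subsequences, and the concrete encoding is an assumption.

```python
def build_sequence_task(sequence, k):
    """sequence: time-ordered list of item ids for one user or session."""
    # Support set: instances drawn from the first k interactions.
    support = [(sequence[:i], sequence[i]) for i in range(1, k)]
    # Query set: the remaining t - k interactions, each predicted from its prefix.
    query = [(sequence[:i], sequence[i]) for i in range(k, len(sequence))]
    return {"support": support, "query": query}

task = build_sequence_task(["v1", "v2", "v3", "v4", "v5"], k=3)
# support: ([v1] -> v2), ([v1, v2] -> v3); query: ([v1..v3] -> v4), ([v1..v4] -> v5)
```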
Others. Besides the four types of tasks mentioned above, several works also explore other ways of
task construction. Scenario-specific tasks [17] are divided according to different scenarios (e.g., tags,
themes, or categories of items) in recommendation systems. Specifically for POI recommendation, city-
specific tasks [7, 97] organize interactions according to different cities, so that meta-knowledge could
be extracted across multiple city-specific tasks and benefit data-sparse cities.

Fig. 3. Illustration of task construction for time-specific tasks and sequence-specific tasks.

Different from user-
specific tasks which utilize interactions of a single user as a task, interactions of multiple users could
also be combined and treated as one task [43, 129]. Specifically, in cross-domain recommendation
systems, Zhu et al. [129] randomly sample two groups of overlapping users (denoted as $U_a$ and
$U_b$) and construct a cross-domain meta-learning task by gathering all interactions of the two
groups (denoted as $D_a$ and $D_b$) into a support set (i.e., $\mathcal{S}_i = D_a$) and a query set (i.e., $\mathcal{Q}_i = D_b$), respectively. The goal of each
task is to learn an embedding mapping model from a source domain to a target domain for better
performance over cold-start users in the target domain (simulated with 𝐷𝑏 ), while meta-learning
contributes to the learning of the mapping model across multiple tasks. With a similar strategy of
task construction, Kim et al. [43] also separately sample two groups of multiple users as training
data for the two update phases of a meta-learning task. Besides recommendation tasks, Hao et al. [30]
construct reconstruction tasks as pretraining tasks in their proposed meta-learning based cold-
start recommendation method. Each reconstruction task consists of a target user and $K$ sampled
neighboring users, and aims to reconstruct the target user's embedding from his neighbors.
Fig. 4. Illustration of the framework of optimization-based parameter initialization and adaptive hyperparameters. Based on two levels of optimization, including local adaptation and global optimization, the optimization-based meta-learner is updated across meta-training tasks. Both parameter initialization and adaptive hyperparameters could be learned according to different designs of meta-learners.
By analogy to the few-shot learning problem, cold-start recommendation has also received much attention
and has been well studied by meta-learning based methods. Here, we summarize how existing works apply
meta-learning to alleviate the cold-start issues for both cold-start users and items into different
groups, including optimization-based parameter initialization, optimization-based parameterized
hyperparameters, model-based parameter modulation and metric-based embedding space learning.
Next, we will elaborate on different categories of methods and introduce details of concrete methods.
Optimization-based Parameter Initialization. Table 5 shows the summary of optimization-
based meta-learning methods in cold-start recommendation from three perspectives, i.e., cold-start
object, meta-knowledge representation, and key techniques used in the bi-level optimization
framework. Existing methods generally fall into two categories according to two forms of meta-
knowledge representations, including parameter initialization and adaptive hyperparameters. We
present a general framework for both optimization-based parameter initialization and adaptive
hyperparameters in Fig. 4. In the following, we discuss concrete methods for parameter initialization
in this part and adaptive hyperparameters in the next part.
The basic idea of optimization-based parameter initialization is defining the meta-knowledge 𝜔 as
the initial parameters of base recommendation models and then updating the parameter initialization
in the form of bi-level optimization. Inspired by the idea of model-agnostic meta-learning [22], Lee
et al. [47] first introduce the MAML framework to cold-start recommendation and propose MeLU,
which aims to learn global parameter initialization of a neural network based recommendation
model as prior knowledge. The base model 𝑓𝜃 is implemented using fully connected neural networks
(FCNs), which act as a personalized user preference estimation model. Here, $\theta$ includes the transformation
parameters $\boldsymbol{W}$ and bias parameters $\boldsymbol{b}$ of both the hidden layers and the final output layer in the base
recommendation model, which are initialized with the globally learned parameter initialization $\omega$
via $\theta \leftarrow \omega$. Following the bi-level optimization procedure, MeLU constructs user cold-start tasks
and locally updates the parameters of the personalized recommendation model for each user $u_i$
as in equation (6). After the local update process, a user-specific recommendation model $f_{\theta_{\mathcal{T}_i}}$ is
especially learned for the task T𝑖 , and employed to make preference predictions on its unseen
query set $\mathcal{Q}_i$. In the global optimization procedure, the global parameter initialization $\theta$, which is
applied to the local update processes of multiple meta-training tasks simultaneously, is optimized
by minimizing the summed loss on query sets as in equation (7). After iterative global update steps
during the meta-training phase, the global parameter initialization is supposed to be able
to quickly adapt to new cold-start recommendation tasks in the meta-testing set $\mathcal{D}^{test}$. In MeLU,
the parameters of the user preference estimation model are optimized under the MAML framework
while user/item embeddings are only globally updated. In addition, MeLU is shown to be effective
in handling both user and item cold-start issues by dividing both users and items into existing groups
and new groups.
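The sketch below, assuming PyTorch 2.x and illustrative dimensions, shows the division of labor just described: only the FCN layers receive a local update per user task as in equation (6), while the user/item embeddings stay out of the inner loop and receive gradients only through the global (query-loss) update.

```python
import torch
import torch.nn as nn

user_emb, item_emb = nn.Embedding(50, 16), nn.Embedding(50, 16)  # globally updated only
fcn = nn.Sequential(nn.Linear(32, 32), nn.ReLU(), nn.Linear(32, 1))
alpha = 0.01                                   # local update rate

def predict(fcn_params, users, items):
    x = torch.cat([user_emb(users), item_emb(items)], dim=-1)
    return torch.func.functional_call(fcn, fcn_params, (x,)).squeeze(-1)

theta = dict(fcn.named_parameters())           # global initialization of the FCN
u_s, v_s = torch.tensor([1, 1]), torch.tensor([3, 7])       # support interactions
r_s = torch.tensor([5.0, 3.0])
u_q, v_q, r_q = torch.tensor([1]), torch.tensor([9]), torch.tensor([4.0])

# Local update (equation (6)): one gradient step on the support set.
support_loss = ((predict(theta, u_s, v_s) - r_s) ** 2).mean()
grads = torch.autograd.grad(support_loss, list(theta.values()), create_graph=True)
theta_i = {k: p - alpha * g for (k, p), g in zip(theta.items(), grads)}

# Query loss drives the global update (equation (7)): backward reaches the
# FCN initialization (second-order) and the embeddings (first-order).
((predict(theta_i, u_q, v_q) - r_q) ** 2).mean().backward()
```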
Drawing on the idea of globally learning model initialization parameters across multiple cold-
start tasks, some other works are also proposed with the help of the original MAML framework.
On the basis of MeLU, Chen et al. [8] propose a multi-prior meta-learning approach, MPML, which
maintains multiple sets of initialization parameters. For a cold-start task, the set of initialization
parameters to assign depends on which performs better after a local update over its support set. Besides simple
FCN-based collaborative filtering models, optimization-based meta-learning has also been utilized
to learn initialization for different forms of recommendation models. For instance, MetaEDL [67]
adopts the MAML framework to learn initialization parameters of an evidential learning enhanced
recommendation model which additionally assigns evidence to predicted interactions. Considering
the temporal evolution of user preferences, DML [66] is designed to continuously capture time-
evolving factors from all historical interactions of a user and quickly learn time-specific factors based
Fig. 5. Illustration of different parameter initialization strategies in three representative methods: MeLU, MAMO, and PAML. In short, MeLU shares a global initialization among all tasks, while MAMO and PAML tailor task-specific initialization considering user profiles and user preferences, respectively.
on a small number of current interactions. Specifically, the module for capturing time-specific
factors is learned under the MAML framework in order to quickly adapt to each time period, where
the number of the user's interactions is usually small.
One promising line of extending the MAML framework is to take the task heterogeneity issue into
consideration by tailoring task-specific initialization for different tasks [15, 70, 105]. We present the
core ideas of the initialization strategies of representative works in Fig. 5. One representative work,
MAMO [15] is proposed to provide a personalized bias term when initializing the recommendation
model parameters. Specifically, memory networks are introduced into optimization-based meta-
learning as external memory units to store task-specific fast-weight memories. Before assigning the
global initialization learned under the MAML framework to the base model, MAMO applies memory
units to generate a personalized bias term $b_{u_i}$ and obtains a task-specific initialization $\theta_{u_i} \leftarrow \omega - \tau b_{u_i}$.
Here, $b_{u_i}$ is generated by querying the fast-weights memory $M_W$ with the profile representation $p_{u_i}$ of a given
user $u_i$ as follows:
$$b_{u_i} = a_{u_i}^{T} M_W \qquad (8)$$
$$a_{u_i} = attention(p_{u_i}, M_P) \qquad (9)$$

where $M_P$ is the profile memory stored during the training process and $M_W$ is the fast-weights memory
storing training gradients as fast weights. As for the model and memory optimization, the two memory
matrices are updated over the training task of $u_i$ as follows:

$$M_P = \lambda (a_{u_i} p_{u_i}^{T}) + (1 - \lambda) M_P \qquad (10)$$
$$M_W = \delta (a_{u_i} \nabla_\theta \mathcal{L}(f_\theta, \mathcal{S}_i)) + (1 - \delta) M_W \qquad (11)$$
where 𝜆 and 𝛿 are hyperparameters as memory update ratios. Note that we only present one part
of the utilization of memories in MAMO, while more details and extensions could be seen in the
original paper [15]. Consequently, by injecting the profile-aware initialization bias $b_{u_i}$, MAMO
tailors the task-specific initialization $\theta_{u_i}$ to cope with the task heterogeneity issue w.r.t. user profiles.
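A minimal sketch of equations (8)-(11), with a softmax attention and small illustrative shapes as assumptions:

```python
import torch

K, d_p, d_theta = 4, 8, 10                 # memory slots, profile dim, parameter dim
M_P, M_W = torch.randn(K, d_p), torch.zeros(K, d_theta)
lam, delta, tau = 0.1, 0.1, 0.1            # memory update ratios and bias scale

def personalized_init(theta, p_u):
    a_u = torch.softmax(M_P @ p_u, dim=0)  # equation (9): attend over profile memory
    b_u = a_u @ M_W                        # equation (8): read the personalized bias
    return theta - tau * b_u, a_u          # theta_u <- omega - tau * b_u

def update_memories(a_u, p_u, grad_support):
    global M_P, M_W
    M_P = lam * torch.outer(a_u, p_u) + (1 - lam) * M_P                 # equation (10)
    M_W = delta * torch.outer(a_u, grad_support) + (1 - delta) * M_W    # equation (11)

theta = torch.randn(d_theta)               # globally meta-learned initialization omega
p_u = torch.randn(d_p)                     # profile representation of user u_i
theta_u, a_u = personalized_init(theta, p_u)
update_memories(a_u, p_u, grad_support=torch.randn(d_theta))  # stand-in support gradient
```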
Following the same idea of customizing task-specific initialization, Wang et al. [105] also argue
that similar prior knowledge should be shared by users with similar preferences. Therefore, a
preference-adaptive meta-learning approach PAML is proposed to adjust the globally shared prior
initialization 𝜃 to the preference-specific initialization 𝜃𝑢𝑖 by applying an external meta model.
Specifically, the meta model acts as a preference-specific adapter by incorporating social relations
from social networks and semantic relations from heterogeneous information networks (HINs).
When customizing the preference-specific initialization 𝜃𝑢𝑖 , a series of preference-specific gates
$\boldsymbol{g}_{u_i}$ are designed to control how much prior knowledge is shared, implemented as follows:

$$\boldsymbol{g}_{u_i} = \sigma(\boldsymbol{W}_g \boldsymbol{u}_i + \boldsymbol{b}_g) \qquad (12)$$
$$\theta_{u_i} = \theta \circ \boldsymbol{g}_{u_i} \qquad (13)$$
where $\boldsymbol{u}_i$ is the user preference representation, learned not only from the user's own interactions
but also from representations of his/her explicit friends extracted from social relations and implicit
friends extracted from semantic relations, respectively. Since user relations are comprehensively
modeled by incorporating both social networks and HINs, final user preference representation 𝒖𝑖 is
supposed to trigger similar gates for users who share similar preferences. Finally, after obtaining
preference-specific initialization 𝜃𝑢𝑖 , optimization-based meta-learning (i.e., MAML framework) is
utilized to optimize parameters of both the base recommendation model and the meta model. Here,
the base recommendation model includes the preference modeling module previously discussed
and an FCN-based rating prediction module. Different from MAMO which focuses on user profile
information, PAML distinguishes different tasks mainly based on multiple types of user relations.
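A minimal sketch of equations (12)-(13), with illustrative dimensions; the point of the design is visible directly: nearby preference representations $\boldsymbol{u}_i$ produce nearby gates and hence similar priors.

```python
import torch

d_u = d_theta = 16                         # preference dim and (flattened) parameter dim
W_g = 0.1 * torch.randn(d_theta, d_u)      # gate parameters
b_g = torch.zeros(d_theta)

def preference_specific_init(theta, u_i):
    g = torch.sigmoid(W_g @ u_i + b_g)     # equation (12): gates in (0, 1)
    return theta * g                       # equation (13): elementwise gating of theta

theta = torch.randn(d_theta)               # globally shared prior initialization
u_a = torch.randn(d_u)
u_b = u_a + 0.01 * torch.randn(d_u)        # a user with a similar preference representation
init_a = preference_specific_init(theta, u_a)
init_b = preference_specific_init(theta, u_b)  # similar gates -> similar tailored priors
```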
Without incorporating external task relations for revealing differences among tasks, Pang et
al. [70] propose PNMTA to discover implicit task distribution from users’ interaction contexts
and perform task-adaptive initialization adjustment. Specifically, a meta model F𝜔 is designed to
generate task-specific initialization 𝜃𝑢𝑖 for the base prediction model by conducting parameter
modulation as follows:
$$\boldsymbol{w}_i, \boldsymbol{b}_i = \mathcal{F}_\omega(\boldsymbol{t}_i) \qquad (14)$$
$$\theta_{u_i} = \boldsymbol{w}_i \odot \theta + \boldsymbol{b}_i \qquad (15)$$
where $\boldsymbol{t}_i$ is the task vector learned by aggregating all interaction representations. Conditioned on
the task representation, the meta model generates task-adaptive modulation signals, i.e., parameters
of the modulation function. Here, we present feature-wise linear modulation (FiLM) while other
types of modulation functions such as channel-wise modulation and soft attention modulation are
also discussed in the original paper. In the meta-training phase, both parameters of the meta-model
𝜔 and global initialization 𝜃 of the base model are optimized under the MAML framework.
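A minimal sketch of equations (14)-(15), assuming mean pooling as the aggregation of interaction representations and illustrative dimensions:

```python
import torch
import torch.nn as nn

d_x, d_theta = 8, 12
meta_model = nn.Linear(d_x, 2 * d_theta)   # F_omega: task vector -> (w_i, b_i)

def task_adaptive_init(theta, support_reprs):
    t_i = support_reprs.mean(dim=0)        # aggregate interaction representations
    w_i, b_i = meta_model(t_i).chunk(2)    # equation (14): FiLM modulation signals
    return w_i * theta + b_i               # equation (15): modulate the global init

theta = torch.randn(d_theta)               # global initialization of the base model
theta_u = task_adaptive_init(theta, torch.randn(5, d_x))  # task-specific initialization
```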
Besides the extension over the meta-learning framework, MetaHIN [58] is proposed to augment
cold-start tasks from the perspective of task construction. Specifically, different from merely
regarding interacted items of a user as the support set S𝑖 , MetaHIN incorporates multifaceted
semantic contexts S𝑖P into tasks based on multiple meta-paths P = {𝑝 1, 𝑝 2, ..., 𝑝𝑛 } of heterogeneous
information network (HIN). For each meta-path 𝑝𝑘 , a set of items that are reachable from user 𝑢𝑖
is obtained via $p_k$, denoted as $\mathcal{S}_i^{p_k}$. By doing this, the semantic-enhanced support set is obtained
as $(\mathcal{S}_i, \mathcal{S}_i^{\mathcal{P}})$, and the semantic-enhanced query set is obtained similarly as $(\mathcal{Q}_i, \mathcal{Q}_i^{\mathcal{P}})$. After constructing
the semantic-enhanced tasks above, a co-adaptation meta-learner is designed to perform both
semantic- and task-wise adaptation to enhance the ability of local adaptation for each user. task-wise
adaptation to enhance the ability of local adaptation for each user. The co-adaptation adaptation
focuses on adapting to different semantic spaces induced by different meta-paths, respectively.
Overall, the conventional local adaptation phase in MAML is first augmented from the data level
by constructing semantic-enriched tasks and then enhanced with a co-adaptation meta-learner by
designing two levels of local adaptation.
Optimization-based Adaptive Hyperparameters. Besides parameter initialization of base
recommendation models, several works also leverage meta-learning to learn adaptive hyperparame-
ters for different cold-start tasks. For instance, MetaCS [2] adopts a similar bi-level optimization
Table 6. Details of recommendation models with model-based meta-learning methods in cold-start recom-
mendation. The key role of the designed meta model in each method is summarized.

Method | Cold-start object | Base Model | Key Role of Meta Model
LWA [99] | Item | LR / FCN | Task-dependent Parameter Generation
TaNP [53] | User | Encoder & Decoder | Task Relevance aware Parameter Modification
MIRec [122] | Item | FCN | Parameter Generation from few-shot models to many-shot models
CMML [21] | User | FCN | Task-dependent Parameter Modification
Heater [134] | User & Item | FCN | Mixture-of-Experts based Parameter Integration
procedure as MeLU, and additionally meta-updates the value of the local learning rate 𝛼 when
performing global optimization. The updating equation of the local learning rate is as follows:
$$\alpha \leftarrow \alpha - \beta \nabla_{\alpha} \sum_{\mathcal{T}_i \in \mathcal{D}^{train}} \mathcal{L}(f_{\theta^{\mathcal{T}_i}}, \mathcal{Q}_i) \qquad (16)$$
where 𝛼 is the parameterized learning rate for the local update and 𝛽 is a fixed learning rate for the
global update. They argue that the manually fixed learning rate may make the model unable to
converge. In this way, not only model parameters of the base model but also hyperparameters, e.g.,
learning rates, are meta-learned to provide prior knowledge. Note that the learnable learning rate
here is only globally optimized and is not updated during the local adaptation of each task.
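The meta-update of 𝛼 in Eq. (16) can be sketched for a single task with a toy linear model: the inner step keeps 𝛼 in the computation graph so the query loss can be differentiated with respect to it. The log-space parameterization and the squared-error loss are illustrative assumptions.

```python
import torch

theta = torch.randn(8, requires_grad=True)       # meta-learned initialization
log_alpha = torch.zeros((), requires_grad=True)  # learnable local rate (log-space keeps it positive)
beta = 1e-2                                      # fixed global learning rate

def loss_fn(params, x, y):
    return ((x @ params - y) ** 2).mean()        # toy recommendation loss

x_s, y_s = torch.randn(4, 8), torch.randn(4)     # support set S_i
x_q, y_q = torch.randn(4, 8), torch.randn(4)     # query set Q_i

# Inner (local) update with the learnable rate alpha, kept differentiable
alpha = log_alpha.exp()
grad_s = torch.autograd.grad(loss_fn(theta, x_s, y_s), theta, create_graph=True)[0]
theta_task = theta - alpha * grad_s

# Outer (global) update: Eq. (16) meta-updates alpha through the query loss
query_loss = loss_fn(theta_task, x_q, y_q)
g_theta, g_alpha = torch.autograd.grad(query_loss, [theta, log_alpha])
with torch.no_grad():
    theta -= beta * g_theta
    log_alpha -= beta * g_alpha
```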
With collaborative filtering methods as the base model, MetaCF [107] also leverages the MAML
framework to meta-learn initialization for learnable parameters such as item embeddings in FISM
[40] and embedding transformation parameters in NGCF [106]. Similar to MetaCS [2], MetaCF
also adopts a flexible update strategy by automatically learning appropriate learning rates. For
task construction, MetaCF adopts two additional strategies, dynamic subgraph sampling and
potential interaction extraction, which inject dynamicity and semantics into the recommendation
tasks.
Similarly, Yu et al. [119] propose a personalized adaptive learning rate meta-learning approach
PALRML, which sets different learning rates for different users to find task-adaptive parameters
for each task. They argue that assuming a uniform user distribution in recommendation systems
may lead to over-fitting on majority users with similar features. In other words, minority users
whose features differ from the majority may receive insufficient attention. Therefore, PALRML
performs user-adaptive learning rate based meta-learning to improve the performance of the basic
MAML framework. Specifically, the local adaptation on each task T𝑖 is adjusted as:
𝜃 T𝑖 = 𝜃 − 𝛼 (ℎ𝑖 )∇𝜃 L (𝑓𝜃 , S𝑖 ). (17)
where 𝛼 (ℎ𝑖 ) is a mapping function for assigning an appropriate learning rate for each user 𝑢𝑖
according to the user’s feature embedding ℎ𝑖 . Three different strategies including adaptive learning
rate based, approximated tree-based, and regularizer-based are designed to provide personalized
learning rates. Low space complexity and good prediction performance are supposed to be achieved
simultaneously.
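A minimal sketch of the user-adaptive local update of Eq. (17): a small mapping network produces a bounded per-user learning rate from the user feature embedding ℎ𝑖. The sigmoid bounding and the toy support loss are assumptions; PALRML's three strategies differ in how 𝛼(ℎ𝑖) is concretely realized.

```python
import torch
import torch.nn as nn

class AdaptiveLR(nn.Module):
    """Maps a user feature embedding h_i to a positive, bounded learning rate alpha(h_i)."""
    def __init__(self, emb_dim: int, base_lr: float = 0.01):
        super().__init__()
        self.proj = nn.Linear(emb_dim, 1)
        self.base_lr = base_lr

    def forward(self, h_i: torch.Tensor) -> torch.Tensor:
        return 2 * self.base_lr * torch.sigmoid(self.proj(h_i))  # rate in (0, 2 * base_lr)

lr_net = AdaptiveLR(emb_dim=16)
h_i = torch.randn(16)                            # user feature embedding
theta = torch.randn(32, requires_grad=True)      # globally shared initialization
support_loss = (theta ** 2).sum()                # stand-in for L(f_theta, S_i)
grad = torch.autograd.grad(support_loss, theta)[0]
theta_task = theta - lr_net(h_i) * grad          # user-adaptive local update (Eq. 17)
```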
Model-based Parameter Modulation. Another category of meta-learning based approaches
for cold-start recommendation adopts model-based meta-learning for parameter modulation. The
core idea is to train a meta model F𝜔 which directly controls or alters the state of base recommen-
dation models without relying on inner-level optimization. More specifically, the form of the meta
model is usually a learnable neural network that takes interactions in the support set of a task and
other useful information (such as losses or gradients) as input to learn task-specific information.
The ways of altering states of the base model for a task depend on the design of different methods,
i.e., the output form of the meta model. For instance, some works adopt parameter-generation
strategies, which directly treat the outputs of the meta model as the task-specific parameters of the
base model. Meanwhile, some works take more indirect ways such as gating-based modification of
globally shared parameters. We summarize three categories of parameter modulation strategies
including parameter generation, parameter modification, and parameter integration, which are
illustrated in Fig 6. Table 6 shows the summary of model-based parameter modulation methods.
One strategy for designing meta models for parameter modulation is to directly generate task-
specific parameters of base models. For instance, Vartak et al. [99] propose two models, LWA
and NLBA, to address the item cold-start problem. Both LWA and NLBA adopt similar deep
neural network architectures as meta models to implement the parameter-generation strategy;
the two models differ in the form of the recommendation model and the parameters to be
adjusted. Taking LWA as an example, the meta-learner F𝜔 consists of two sub-networks
G(.) and H(.). The first sub-network G(.) learns task representations based on the interacted
items of a given user: embeddings of positive and negative interactions are aggregated as
$R_i^p = \mathcal{G}(I^p)$ and $R_i^n = \mathcal{G}(I^n)$, respectively. The second sub-network H(.) directly adjusts the base
model based on $R_i^p$ and $R_i^n$ by learning a vector $\mathbf{w}_i = \mathbf{w}_p R_i^p + \mathbf{w}_n R_i^n$. Here, $\mathbf{w}_i$ contains the generated
linear transformation parameters of a logistic regression (LR) function specific to user 𝑢𝑖.
The logistic regression function then acts as the user-specific recommendation model to predict
the interaction probability of a new item. Similarly, NLBA utilizes a neural network classifier as the
base model and generates the bias parameters of all hidden layers to implement parameter generation.
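The parameter-generation strategy of LWA can be sketched as follows, assuming mean pooling for the aggregation network G(.); the pooling choice and all dimensions are illustrative rather than the paper's exact design.

```python
import torch
import torch.nn as nn

class LWAStyleMeta(nn.Module):
    """Generates user-specific logistic-regression weights from support interactions."""
    def __init__(self, item_dim: int):
        super().__init__()
        self.w_p = nn.Parameter(torch.randn(item_dim, item_dim) * 0.01)
        self.w_n = nn.Parameter(torch.randn(item_dim, item_dim) * 0.01)

    def forward(self, pos_items, neg_items):
        r_p = pos_items.mean(dim=0)               # R_i^p = G(I^p): pooled positive interactions
        r_n = neg_items.mean(dim=0)               # R_i^n = G(I^n): pooled negative interactions
        return self.w_p @ r_p + self.w_n @ r_n    # generated LR weights w_i

meta = LWAStyleMeta(item_dim=32)
pos, neg = torch.randn(5, 32), torch.randn(5, 32)  # support interactions of one user
w_i = meta(pos, neg)
new_item = torch.randn(32)
prob = torch.sigmoid(w_i @ new_item)               # user-specific interaction probability
```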
To improve tail-item recommendation, i.e., item cold-start recommendation, Zhang et al. [122]
propose MIRec, which focuses on transferring knowledge from head items with rich user feedback
to tail items with few interactions. Following the parameter-generation strategy in model-based
meta-learning, a meta-mapping module is designed to transfer parameters of a few-shot model
to a many-shot model, which achieves model-level augmentation. Specifically, a meta model
F𝜔 learns to capture the parameter mapping from a few-shot model to a many-shot model.
The meta-knowledge to be learned in MIRec can be interpreted as knowledge about how model
parameters transform when more training data is observed. Given a base model 𝑔𝜃, the many-shot
model 𝑔𝜃∗ parameterized with 𝜃∗ is learned by feeding in all user feedback. Then, to learn the
meta-knowledge of model transformation, the meta model F𝜔 is incorporated into the training
process of a few-shot model 𝑔𝜃𝑘 (trained with tail items that have fewer than 𝑘 interactions) by
minimizing the following objective function:
L (𝜔, 𝜃 𝑘 ) = ||F𝜔 (𝜃 𝑘 ) − 𝜃 ∗ || 2 + L𝑟𝑒𝑐 (𝑔𝜃𝑘 , 𝐷𝑘 ) (18)
where F𝜔(.) takes the parameters 𝜃𝑘 of the few-shot model as input and generates many-shot model
parameters. The first term, an L2 distance, trains the parameter-mapping ability of F𝜔 from
few-shot models to many-shot models. After training, the final recommendation model is obtained
by integrating both the original many-shot model 𝑔𝜃∗ and the meta-mapped few-shot model
𝑔F𝜔(𝜃𝑘), in order to perform well on both head and tail items.
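A compact sketch of the meta-mapping objective in Eq. (18), with the few-shot and many-shot models reduced to parameter vectors of a toy linear scorer; the MLP form of F𝜔 and the squared-error recommendation loss are assumptions for illustration.

```python
import torch
import torch.nn as nn

param_dim = 64
meta_map = nn.Sequential(  # F_omega: few-shot parameters -> many-shot parameters
    nn.Linear(param_dim, 128), nn.ReLU(), nn.Linear(128, param_dim))

theta_star = torch.randn(param_dim)                   # many-shot model parameters (fixed target)
theta_k = torch.randn(param_dim, requires_grad=True)  # few-shot model parameters

def rec_loss(params, data):
    x, y = data
    return ((x @ params - y) ** 2).mean()             # stand-in for L_rec(g_{theta_k}, D_k)

D_k = (torch.randn(16, param_dim), torch.randn(16))   # few-shot (tail-item) data

# Eq. (18): parameter-mapping loss plus the few-shot recommendation loss
loss = ((meta_map(theta_k) - theta_star) ** 2).sum() + rec_loss(theta_k, D_k)
loss.backward()  # trains both the meta mapping F_omega and the few-shot model
```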
Another common strategy of designing meta models for parameter modulation is to modify
globally shared parameters into task-specific ones. Instead of directly taking the outputs of meta
models as parameters of base models, the core idea of the parameter-modification strategy is to tailor
global parameters into task-specific ones under the control of meta models. Lin et al. [53] propose
TaNP, which designs a task relevance aware parameter modulation mechanism to customize
task-adaptive parameters for base recommendation models. Specifically, TaNP approximates each
Fig. 6. Illustration of different parameter modulation strategies including parameter generation, parameter
modification, and parameter integration. One basic example is presented for each category.
task as an instantiation of a stochastic process and utilizes an encoder and decoder structure as the
preference estimation module, i.e., the base recommendation model. The meta model is designed
for modulating parameters of the decoder module. Specifically, the meta model F𝜔 first leverages a
task identity network to encode interactions and a learnable global pool to automatically learn the
relevance of different tasks. By doing this, the task representation is obtained as 𝒐𝑖 and utilized to
provide task relevance aware information for parameter modulation. Two candidate modulation
strategies including FiLM [72] and an extended Gating-FiLM are discussed to scale and shift the
parameters of hidden layers of the decoder. Taking FiLM as an example, for user 𝑢𝑖 the
adjustment of the 𝑙-th hidden layer can be defined as:
$$scale_i^l = \tanh(\mathbf{W}_a^l \mathbf{o}_i), \quad shift_i^l = \tanh(\mathbf{W}_b^l \mathbf{o}_i), \qquad (19)$$
$$\mathbf{x}_i^{l+1} = \mathrm{ReLU}\big(scale_i^l \odot (\mathbf{W}_{dec}^l \mathbf{x}_i^l + \mathbf{b}_{dec}^l) + shift_i^l\big) \qquad (20)$$
where $\mathbf{W}_{dec}^l$ and $\mathbf{b}_{dec}^l$ are the global parameters of the decoder, $scale_i^l$ and $shift_i^l$ are the modulation
signals generated by the meta model, and $\mathbf{x}_i^l$ is the input of the 𝑙-th layer of the decoder. In this
way, the decoder is tailored to each task according to the task relevance aware modulation signals.
Fig. 7. Illustration of different structures of embedding generators in meta-learning methods for CTR predic-
tion: (a) embedding lookup, (b) optimization-based embedding initialization, and (c) model-based embedding
generation. We mainly compare them in terms of what kind of auxiliary information is considered when
generating initial embeddings or warm embeddings for new items.
As an instance of the parameter-integration strategy, Heater [134] designs a user-specific
transformation module composed of 𝑀 parallel experts with the same structure. Each expert 𝑓^𝑚
takes the user representation 𝒖𝑖 as input and outputs a transformed representation 𝑓^𝑚(𝒖𝑖) of the
user. The parameter-integration strategy works by adaptively combining the outputs of all experts
{𝑓^1(𝒖𝑖), ..., 𝑓^𝑀(𝒖𝑖)} with learnable weights, which is equivalent to an adaptive integration of the
parameters of multiple experts. As a result, the final transformation function 𝑓𝑖^𝑈 is user-specific
for each user.
Metric-based Embedding Space Learning. Metric-based meta-learning is also utilized in cold-
start recommendation to meta-learn an embedding space for embedding similarity comparison. To
alleviate the cold-start problem in long-tail item recommendation, Sankar et al. [82] propose ProtoCF,
which learns a shared metric space for measuring embedding similarities between candidate cold-
start items and users. Specifically, inspired by Prototypical Networks [88], ProtoCF learns to
compose discriminative prototypes for tail items from their few-shot interactions. Based on the
support set S𝑖 , the prototype representation for each item 𝑣𝑖 is first computed as the mean vector
of pretrained user embeddings. Then, a fixed number of group embeddings are learned as external
memories to enrich prototype representations of each item. Finally, following the framework of
metric learning, given a query user, the similarities between prototype representations {𝒑 1, ..., 𝒑 𝑁 }
of candidate items and the user representation 𝒖𝑖 are computed in the meta-learned metric space.
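A bare-bones sketch of prototype construction and metric-based scoring in the spirit of ProtoCF; cosine similarity is an assumed choice of metric, and the group-embedding enrichment described above is omitted for brevity.

```python
import torch
import torch.nn.functional as F

def item_prototypes(support_user_embs):
    """Prototype of each tail item: mean of the pretrained embeddings of its few users."""
    return torch.stack([embs.mean(dim=0) for embs in support_user_embs])

# Few-shot interactions: each tail item was consumed by only a handful of users
support_user_embs = [torch.randn(3, 32), torch.randn(5, 32), torch.randn(2, 32)]
prototypes = item_prototypes(support_user_embs)        # {p_1, ..., p_N}

u = torch.randn(32)                                    # query user representation u_i
scores = F.cosine_similarity(u.unsqueeze(0), prototypes, dim=-1)
ranking = scores.argsort(descending=True)              # rank candidate tail items for the user
```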
Borrowing the idea of measuring embedding similarity, Hao et al. [30] study how to pretrain
GNNs to learn embeddings for cold-start users and items via few-shot reconstruction tasks. Instead
of learning an embedding space for calculating embedding similarity between users and items, the
PreTraining approach focuses on learning a reconstruction space for comparing reconstructed
embeddings of few-shot users/items with their ground-truth embeddings learned from abundant
interactions. Reconstruction tasks first select target users/items that have sufficient interactions
and simulate cold-start situations by sampling a few neighbors for each target user/item. Assuming
embeddings trained with abundant interactions are ground truths, the goal of the reconstruction
tasks is to reconstruct embeddings based on few-shot neighbors. By measuring and maximizing
the similarities between the reconstructed embeddings and the ground truths, the pretrained GNNs
are supposed to learn an effective embedding space for cold-start users and items.
Table 7. Details of recommendation models with meta-learning methods in click through rate prediction.
The main idea of this category is to design an external ID embedding generator as
a meta-learner and apply it to generate adaptive initial ID embeddings for newly arrived items.
The meta-learner is trained under the optimization-based meta-learning framework.
Pan et al. [69] first propose the idea of meta-learning an initial embedding generator to replace
the random initialization strategy for the click-through rate prediction problem. Specifically, as shown
in Fig 7 (b), an item/Ad feature based embedding generator named Meta-Embedding is designed to
take Ad attributes as inputs and generate item-specific initial embeddings 𝒗𝑖^{𝑖𝑛𝑖}. The generated
item ID embedding 𝒗𝑖^{𝑖𝑛𝑖} is then combined with other feature embeddings such as user embeddings,
item attribute embeddings, and context embeddings, and fed into pretrained prediction models,
e.g., DeepFM [29], PNN [76], Wide&Deep [9]. For the meta-optimization of the Meta-Embedding
generator, two batches of labeled instances are sampled for each cold-start item. The first batch
D𝑖^𝑎 is utilized to evaluate the cold-start performance by directly making predictions with 𝒗𝑖^{𝑖𝑛𝑖}. The
second batch D𝑖^𝑏 is utilized to evaluate the warm-up performance by making predictions with the
item embedding 𝒗𝑖^{𝑤𝑎𝑟𝑚}, which is locally updated over the first batch D𝑖^𝑎. By doing this, two
losses L𝑐𝑜𝑙𝑑(𝒗𝑖^{𝑖𝑛𝑖}, D𝑖^𝑎) and L𝑤𝑎𝑟𝑚(𝒗𝑖^{𝑤𝑎𝑟𝑚}, D𝑖^𝑏) are obtained in the cold-start phase and the warm-up phase,
respectively. Based on a unified loss, i.e., L𝑚𝑒𝑡𝑎 = 𝛿L𝑐𝑜𝑙𝑑 + (1 − 𝛿)L𝑤𝑎𝑟𝑚, the outer-level update of
optimization-based meta-learning is performed to globally optimize the generator through gradient
descent.
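The two-batch meta-optimization of Meta-Embedding can be sketched as below: the cold loss on D𝑖^𝑎 yields the warm embedding via one local step, and the combined loss updates the generator. The linear generator, the logit form, and all sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

gen = nn.Linear(20, 16)            # embedding generator: item/Ad attributes -> v_ini
attrs = torch.randn(20)            # attributes of one cold-start item
delta, local_lr = 0.5, 0.1

def ctr_loss(item_emb, batch):
    feats, y = batch               # other (pretrained) feature embeddings and click labels
    return F.binary_cross_entropy_with_logits(feats @ item_emb, y)

batch_a = (torch.randn(8, 16), torch.randint(0, 2, (8,)).float())  # D_i^a
batch_b = (torch.randn(8, 16), torch.randint(0, 2, (8,)).float())  # D_i^b

v_ini = gen(attrs)
loss_cold = ctr_loss(v_ini, batch_a)
# One local step on D_i^a yields the warm embedding
grad = torch.autograd.grad(loss_cold, v_ini, create_graph=True)[0]
v_warm = v_ini - local_lr * grad
loss_warm = ctr_loss(v_warm, batch_b)

meta_loss = delta * loss_cold + (1 - delta) * loss_warm
meta_loss.backward()               # outer-level update of the generator parameters
```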
Following the idea of item ID embedding generation, several works extend the form of the
embedding generator by leveraging auxiliary information beyond item attributes, especially
information from relevant users and relevant items. Ouyang et al. [68] propose a series of graph
meta embedding (GME) models to learn initial item embeddings based on not only item attributes
but also existing relevant items. As shown in Fig 7 (b), GMEs first connect new items with existing
items in a graph through shared item attributes and then apply graph attention networks to
distill neighborhood information for generating embeddings of cold-start items. Three different
strategies for distilling information from existing items, including pre-defining item embeddings,
generating item embeddings from item attributes, and directly aggregating attribute embeddings
without learning ID embeddings, are discussed in different variants of GME. Similar to Meta-
Embedding, GMEs also resort to the optimization-based meta-learning framework to train the
graph neural network based embedding generator with two sampled batches for each task. Similarly,
Li et al. [51] propose a deep interest-shifting network DisNet, which includes a meta-Id-embedding
generator (RM-IdEG) module as the initial ID embedding generator. RM-IdEG collects a
set of existing items relevant to the target cold-start item through item relations and learns an
attentional representation as the initial ID embedding. Similar to Meta-Embedding, the
optimization of RM-IdEG is separated from pretraining the whole DisNet model and is conducted
by minimizing both the cold-start loss and the warm-up loss with optimization-based meta-learning.
Under the framework of optimization-based item embedding initialization, i.e., Meta-Embedding,
the optimization strategy has also been studied to improve adaptation to the diversity of
task difficulty. Cao et al. [5] propose a task-distribution-aware meta-learning method (TDAML)
to ensure consistency between loss weights and task difficulty when globally updating the
embedding generator. They argue that different tasks have different difficulties in the meta-training
phase, and assigning equal weights to all tasks may pay limited attention to the hard tasks. On
top of the Meta-Embedding framework, TDAML adaptively assigns different weights when
summing the meta losses of different tasks. By modeling the weights 𝒑𝑖 of meta-losses as a
description of task difficulty, extra constraints expecting strong consistency between 𝒑𝑖 and the
meta-loss of the task, i.e., L𝑖^{𝑚𝑒𝑡𝑎}, are added to find adaptive loss weights that replace the uniform
weights. As a result, the meta-optimization phase pays more attention to harder tasks and achieves
better performance.
Model-based Item Embedding Generation. Besides optimization-based techniques, model-
based meta-learning is also applied to generate initial item embeddings for better click-through
rate prediction performance. Zhu et al. [133] propose MWUF which aims to meta-learn scaling
and shifting functions for generating ID embeddings of cold-start items. As shown in Fig 7 (c),
different from optimization-based item embedding initialization above, MWUF directly transforms
the cold item ID embedding 𝒗𝑖𝑐𝑜𝑙𝑑 of the item 𝑣𝑖 to a warm item ID embedding 𝒗𝑖𝑤𝑎𝑟𝑚 by applying a
scaling and shifting function as follows:
𝒗𝑖𝑤𝑎𝑟𝑚 = 𝒗𝑖𝑐𝑜𝑙𝑑 · ℎ𝑠𝑐𝑎𝑙𝑒 (𝒙𝑖 ) + ℎ𝑠ℎ𝑖 𝑓 𝑡 (𝑼𝑖 ), (21)
where 𝒙𝑖 denotes the item feature embedding of item 𝑣𝑖 and 𝑼𝑖 denotes the embeddings of its
interacted users. Here, a meta scaling network ℎ𝑠𝑐𝑎𝑙𝑒(∗) takes 𝒙𝑖 as input and generates personalized
scaling parameters, while a meta shifting network ℎ𝑠ℎ𝑖𝑓𝑡(∗) takes 𝑼𝑖 as input and generates
personalized shifting parameters. After obtaining the warm ID embedding 𝒗𝑖^{𝑤𝑎𝑟𝑚}, MWUF directly
makes predictions based on pretrained recommendation models such as Wide&Deep [9], DIN [127],
and AFM [10]. The meta models, i.e., the two meta networks, are optimized by minimizing the warm
loss, which is obtained by making predictions with 𝒗𝑖^{𝑤𝑎𝑟𝑚} over observed interactions of item 𝑣𝑖.
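A minimal sketch of the warm-up transformation of Eq. (21); the concrete forms of the two meta networks and the mean pooling over the interacted-user embeddings 𝑼𝑖 are assumptions for illustration.

```python
import torch
import torch.nn as nn

emb_dim = 16
scale_net = nn.Sequential(nn.Linear(emb_dim, emb_dim), nn.Sigmoid())  # h_scale(x_i)
shift_net = nn.Linear(emb_dim, emb_dim)                               # h_shift(U_i)

v_cold = torch.randn(emb_dim)    # cold item ID embedding of item v_i
x_i = torch.randn(emb_dim)       # item feature embedding
U_i = torch.randn(7, emb_dim)    # embeddings of the item's interacted users

# Eq. (21): warm up the cold embedding with meta-learned scaling and shifting
v_warm = v_cold * scale_net(x_i) + shift_net(U_i.mean(dim=0))
```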
Method | Meta-learning Technique | Task Division | Meta-knowledge representation
S2Meta [17] | Optimization-based | Scenario-specific | Parameter Initialization & Meta-learner & Hyperparameter
FLIP [57] | Optimization-based | Sequence-specific | Parameter Initialization
FORM [93] | Optimization-based | User-specific | Parameter Initialization & Hyperparameter
SML [123] | Model-based | Time-specific | Meta Model
ASMG [71] | Model-based | Time-specific | Meta Model
LSTTM [111] | Optimization-based | Time-specific | Parameter Initialization
MeLON [44] | Model-based | Time-specific | Meta Model & Hyperparameter
To capture evolving user preference from newly arrived user visit sequences, FLIP separately
learns intent embeddings based only on the interactions of the current session, while learning the
preference embedding of the user throughout the whole online learning procedure. Specifically,
inspired by Online MAML [23], an optimization-based meta-learning framework for the online
setting, FLIP learns an initial intent embedding for all sessions which is expected to quickly adapt
to each new session. The support set
of a task consists of the first 𝑚 interactions in the session, and the rest is treated as the query set. The
outer-level update of the initial intent embedding is performed across a batch of tasks. Therefore, by
learning user intent embedding with optimization-based meta-learning techniques, FLIP enhances
the ability of user-level preference updating, especially capturing short-term preference evolution
during the online learning procedure.
Another work FORM [93] also studies meta-learning-based online recommendation based on
user-specific task division. To adapt the optimization-based meta-learning to fluctuating online
scenarios, FORM enhances the MAML framework to provide a more stable training process in
the following directions. First, during local updates of current interactions of a user, a follow the
online meta leader (FTOML) algorithm is designed to preserve prior knowledge extracted from
all historical interactions of the user. In this way, the updated model during the online training
procedure is expected to perform well not only on current data but also on prior data, which stabilizes
user preference learning. Second, to ensure a consistent update process, a regularization term is added
to the loss function to restrict the model parameters to be sparse. Third, considering that users with
abundant interactions have fewer fluctuations, FORM is designed to assign larger learning rates to
users who have larger record lengths and smaller variance of gradients. With the three designs for
tackling the fluctuating and noisy nature of online scenarios, FORM is expected to provide a more
stable meta-optimization phase for online recommenders.
Scenario-level Model Updating. Besides conducting user-level preference learning, Du et al.
[17] consider scenario-specific recommendation tasks and propose a sequential meta-learner
S2Meta to automatically learn personalized models for newly appeared scenarios. For instance,
scenario-specific tasks could be defined according to item category, item tag, theme events, and
so on. When a small set of interactions is collected online in a new scenario 𝑠𝑖, S2Meta aims to
quickly update an initial base model 𝑓𝜃 to a scenario-specific recommendation model 𝑓𝜃𝑖. Specifically,
the meta-knowledge to be globally learned is defined as three factors controlling the inner-level
learning, including initial parameters, learning rates, and early-stop policy. The local update of
each recommendation task is considered as a sequential learning process consisting of initializing,
finetuning with adaptive learning rates, and stopping timely. The sequential learning process is
automatically controlled under three parts of a designed meta model which is learned under the
optimization-based meta-learning framework.
System-level Model Retraining. Online recommendation systems usually require periodical
model retraining with new instances to capture current trends effectively. Recently, several works
formalize the model retraining tasks from the perspective of meta-learning and study meta-learning
based model retraining in the online recommendation [44, 71, 111, 123].
Zhang et al. [123] first investigate the model retraining mechanism from the perspective of meta-
learning. At a time period 𝑡, the model retraining task T𝑡 is constructed with the currently collected
interactions 𝐷𝑡 as the support set and the interactions 𝐷𝑡+1 of the next time period as the query set.
The goal of the model retraining task T𝑡 is to incrementally update the recommendation model 𝑓𝜃𝑡−1
obtained in time period 𝑡−1 to a new one 𝑓𝜃𝑡 which is expected to achieve better performance
in the next time period, i.e., 𝑡+1. Zhang et al. apply model-based meta-learning techniques to
directly transfer parameters 𝜃𝑡−1 to model parameters 𝜃𝑡 with a meta model. Specifically, the meta
model utilizes convolutional neural networks as a transfer component whose inputs are the previous
parameters 𝜃𝑡−1 and the parameters 𝜃̂𝑡 that are locally updated over 𝐷𝑡. The parameters of the next
recommendation model 𝑓𝜃𝑡 are generated from the outputs of the transfer component. To make
the learned model serve well in the next time period, the loss over 𝐷𝑡+1 is used to update the
parameters of the meta model. Since the meta-learning based model retraining framework above
operates in a sequential manner, the method is named Sequential Meta-Learning (SML).
Following the idea of SML, Peng et al. [71] propose another model retraining method, ASMG,
which is devised to generate the current model 𝑓𝜃𝑡 based on a sequence of historical models
{𝑓𝜃1, ..., 𝑓𝜃𝑡−1}. Different from SML, ASMG replaces the CNN-based transfer module with gated
recurrent units (GRU) as a meta-generator that captures long-term sequential patterns in model
evolution. The meta generator inputs a truncated sequence of historical models of previous periods
sequentially. Then the final hidden state 𝒉𝑡 of the GRU is transformed to generate the parameters
of current model 𝑓𝜃𝑡 . Similar to SML, the meta-generator in ASMG is also optimized towards better
performance over interactions of the next time period 𝑡 + 1.
Different from SML, which focuses on updating parameters based on the whole data of the current
time period, a more recent approach MeLON [44] further distinguishes the importance of different
interactions within the data of the same period. Specifically, given an interaction 𝑟, MeLON aims to
learn an adaptive learning rate 𝛼𝑟,𝑚 for the 𝑚-th dimension 𝜃𝑡^𝑚 of the current model parameters 𝜃𝑡.
A meta model is designed to generate the adaptive learning rate based on information from both the
interaction (e.g., relevant historical interactions) and the parameter (e.g., loss and gradient). By
assigning adaptive learning rates for each interaction-parameter pair, MeLON is expected to
update recommendation models more flexibly in online scenarios.
Besides the model-based meta-learning techniques above, model retraining is also studied under
the optimization-based meta-learning framework. Xie et al. [111] propose LSTTM for online recom-
mendation, which relies on graph neural network based recommendation models to extract users'
short-term and long-term preferences. Considering the dynamic nature of short-term preferences in
online scenarios, LSTTM constructs model retraining tasks according to different time periods and
applies optimization-based meta-learning to learn a better initialization of the short-term graph module.
Instead of training only based on current data with meta-learning, the global long-term graph
module is trained constantly during the whole online learning phase. In this way, short-term pref-
erence for new trends or hot topics is captured timely from the recent interactions while long-term
preference which reflects users’ stable interests is also maintained after the model retraining.
Method | Task Division | Sequential Information | Meta-knowledge representation
PREMERE [43] | User-specific | Sequential-free | Meta Model & Sample Weight
MFNP [91] | User-specific | Sequential-aware | Parameter Initialization
CHAML [7] | City-specific | Sequential-aware | Parameter Initialization & Sample Weight
Meta-SKR [11] | User-specific | Sequential-aware | Parameter Initialization & Meta Model
MetaODE [97] | City-specific | Sequential-aware | Parameter Initialization
Rather than adjusting the sampling probability of samples during the sampling phase, PREMERE
randomly samples instances but focuses on reweighting the losses of instances in the sampled batch.
Optimization-based Parameter Initialization. Recently, optimization-based meta-learning
methods have also been leveraged to learn parameter initialization of specific modules in next POI
recommendation models. Sun et al. [91] propose MFNP, which captures user-specific preferences and
region-specific preferences with two LSTM-based modeling modules, respectively. By initializing
the parameters of the recommendation model, MFNP locally updates models on corresponding
support sets for different users and globally optimizes the initialization via the MAML framework.
Another work [11] proposes a sequential knowledge graph based recommendation model Meta-
SKR for the next POI recommendation. By jointly modeling sequential, geographical, temporal,
and social information with designed sequential knowledge graphs, the next POI recommendation
problem is formulated as a link prediction problem based on graph embedding learning. To alleviate the
check-in sparsity problem in embedding learning, an optimization-based meta-learning frame-
work LEO [81] is adopted to generate the weights of the GRU-based and GAT-based sequential
embedding network which learns node embeddings from the sequential knowledge graphs. In
addition, optimization-based meta-learning is also utilized in MetaODE [97] to learn parameter
initialization across multiple source cities with sufficient data, so as to gain better generalization
over data-insufficient cities.
For instance, MetaTL [102] formulates the task of each cold-start user as predicting the next item
transition (i.e., the query set) given previous transition pairs $\{i_j \rightarrow i_{j+1}\}_{j=1}^{t-1}$ (i.e., the support set).
The transition-based recommendation model aggregates the transitional information of user 𝑢𝑖
based on the multiple pairs in the support set to obtain a relation representation 𝒓𝑢𝑖 and calculates
the preference score as $-\|\mathbf{i}_t + \mathbf{r}_{u_i} - \mathbf{i}_{t+1}\|^2$. MetaTL also applies the MAML framework to learn an
effective global initialization of the transition model for all cold-start users.
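The translation-style score above is simple enough to sketch directly; the embeddings and candidate set below are random stand-ins.

```python
import torch

def transition_score(i_t, r_u, i_next):
    """Preference score -||i_t + r_u - i_{t+1}||^2 under the learned user relation r_u."""
    return -((i_t + r_u - i_next) ** 2).sum(dim=-1)

i_t = torch.randn(32)                  # embedding of the user's latest item
r_u = torch.randn(32)                  # relation representation aggregated from support pairs
candidates = torch.randn(100, 32)      # candidate next-item embeddings
scores = transition_score(i_t, r_u, candidates)
top_items = scores.topk(10).indices    # recommend the highest-scoring items
```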
Different from applying optimization-based meta-learning to learn suitable initialization of
sequential models, metric-based meta-learning has also been studied in the cold-start sequential
recommendation scenario. Zheng et al. [125] propose Mecos to address the item cold-start issue in
sequential recommendation. They first construct 𝑁-way 𝐾-shot classification tasks by sampling
𝐾 sequences for each of 𝑁 cold-start items. Then, Mecos learns holistic representations for the
support and query sets of different items and leverages a matching network to calculate the
similarity scores between each support and query pair, so as to generate classification results for
the 𝑁 query sets according to the similarity metric. The matching network is optimized in the
meta-training phase with the constructed classification tasks and can be directly utilized to make
predictions without local adaptation over meta-testing tasks.
Such algorithm selection approaches fall outside the range of deep meta-learning that we discussed
in this survey. More related works can be found in [13, 78].
Recently, Luo et al. [59] have studied the recommendation model selection problem under the frame-
work of optimization-based meta-learning. Given a collection of recommendation models, a model
selector MetaSelector is designed to adaptively ensemble all models by generating soft selec-
tion weights. By regarding each task as learning suitable model selection weights for a user, the
model selector is optimized across different model selection tasks under an adaptive learning rate
augmented MAML framework. In the local adaptation phase, for each task, the model selector
is first locally updated with the support set of the user and then generates personalized model
selection weights whose effectiveness is evaluated over the query set. In the global optimization phase,
the initialization of the model selector is updated across multiple tasks to ensure fast adaptation
to new model selection tasks. Note that the recommendation models are pretrained with
all data and kept fixed in the meta-training phase.
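A toy sketch of soft model selection over frozen pretrained recommenders: a scorer maps a user representation to softmax selection weights, which ensemble the models' predictions. The linear scorer is an assumed form of the selector, and the MAML-style local adaptation of its parameters is omitted.

```python
import torch
import torch.nn as nn

class SoftModelSelector(nn.Module):
    """Produces soft selection weights over M pretrained recommenders for one user."""
    def __init__(self, user_dim: int, num_models: int):
        super().__init__()
        self.scorer = nn.Linear(user_dim, num_models)

    def forward(self, user_repr, model_preds):
        weights = torch.softmax(self.scorer(user_repr), dim=-1)   # soft selection weights
        return (weights.unsqueeze(-1) * model_preds).sum(dim=0)   # weighted ensemble

selector = SoftModelSelector(user_dim=16, num_models=3)
user_repr = torch.randn(16)
model_preds = torch.randn(3, 50)     # scores of 3 frozen models over 50 candidate items
final_scores = selector(user_repr, model_preds)
```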
6 FUTURE DIRECTIONS
In this section, we analyze the limitations of existing deep meta-learning based recommendation
methods and outline some prospective research directions that are worth exploring in the future.
6.1 Meta-Overfitting
Generalization across different tasks is the key capacity of meta-learning, and it mainly depends
on how well meta-learners fit the whole task distribution with meta-training tasks. Similar to
overfitting over training instances in conventional machine learning, the meta-overfitting issue
occurs when meta-learners merely memorize all meta-training tasks but fail to adapt to novel tasks
(i.e., meta-testing tasks) [116]. Since the number of training tasks is usually much smaller than the
number of instances, the meta-overfitting problem is more severe in meta-learning compared with
regular supervised learning [36]. In the field of recommendation systems, existing meta-learning
methods mainly construct a fixed and limited number of tasks as summarized in section 4, and thus
are likely to suffer from meta-overfitting over meta-training tasks. One straightforward strategy
against meta-overfitting is conducting task augmentation during task construction. For instance,
when constructing typical few-shot classification tasks, 𝑁 classes are randomly sampled and 𝐾
instances of each class are also randomly sampled, as sketched below. In this way, not only is the
volume of available tasks greatly increased, but the tasks are also kept mutually exclusive. Other
efforts on task augmentation [56, 65, 128], meta-regularization [116], and Bayesian meta-learning
[118] have also been studied and proven effective in addressing the meta-overfitting issue. Therefore,
developing meta-learning based recommendation models with better meta-generalization abilities
is a promising direction.
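As a concrete illustration of the random N-way K-shot construction mentioned above, the following sketch samples tasks from a toy class-instance pool; the data layout is hypothetical.

```python
import random

def sample_task(items_by_class, n_way=5, k_shot=3):
    """Randomly samples an N-way K-shot task. Random class and instance choices keep
    the task pool large and tasks mutually exclusive, mitigating meta-overfitting."""
    classes = random.sample(list(items_by_class), n_way)
    return {c: random.sample(items_by_class[c], k_shot) for c in classes}

# Toy pool: 20 classes with 10 instance ids each
pool = {c: list(range(c * 10, c * 10 + 10)) for c in range(20)}
task = sample_task(pool)
```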
Jointly optimizing structural connections and model weights has also been proven mutually beneficial
for better optimization [14]. Thus, designing a neural network architecture search framework to
automatically specify recommendation models for different tasks or datasets could be another future direction.
7 CONCLUSION
The rapid development of deep meta-learning methods has propelled progress in the research
field of recommender systems in recent years. This paper provides a timely survey after systemati-
cally investigating a large number of related papers in this area. We organized existing methods
into a taxonomy of recommendation scenarios, meta-learning techniques, and meta-knowledge
representations. For each recommendation scenario, we introduced technical details about how
existing methods apply meta-learning. Finally, we pointed out several limitations in current research
and highlighted some promising future directions to promote research on meta-learning based
recommendation methods. We hope our survey can be beneficial for both junior and experienced
researchers in the related areas.
REFERENCES
[1] Irwan Bello, Barret Zoph, Vijay Vasudevan, and Quoc V Le. 2017. Neural optimizer search with reinforcement
learning. In International Conference on Machine Learning. PMLR, 459–468.
[2] Homanga Bharadhwaj. 2019. Meta-learning for user cold-start recommendation. In 2019 International Joint Conference
on Neural Networks (IJCNN). IEEE, 1–8.
[3] Daniel Billsus, Michael J Pazzani, et al. 1998. Learning collaborative information filters. In ICML, Vol. 98. 46–54.
[4] Qi Cai, Yingwei Pan, Ting Yao, Chenggang Yan, and Tao Mei. 2018. Memory matching networks for one-shot image
recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition. 4080–4088.
[5] Tianwei Cao, Qianqian Xu, Zhiyong Yang, and Qingming Huang. 2020. Task-distribution-aware Meta-learning for
Cold-start CTR Prediction. In Proceedings of the 28th ACM International Conference on Multimedia. 3514–3522.
[6] Rich Caruana. 1997. Multitask learning. Machine learning 28, 1 (1997), 41–75.
[7] Yudong Chen, Xin Wang, Miao Fan, Jizhou Huang, Shengwen Yang, and Wenwu Zhu. 2021. Curriculum meta-learning
for next POI recommendation. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data
Mining. 2692–2702.
[8] Zhengyu Chen, Donglin Wang, and Shiqian Yin. 2021. Improving cold-start recommendation via multi-prior meta-
learning. In European Conference on Information Retrieval. Springer, 249–256.
[9] Heng-Tze Cheng, Levent Koc, Jeremiah Harmsen, Tal Shaked, Tushar Chandra, Hrishi Aradhye, Glen Anderson, Greg
Corrado, Wei Chai, Mustafa Ispir, et al. 2016. Wide & deep learning for recommender systems. In Proceedings of the
1st workshop on deep learning for recommender systems. 7–10.
[10] Weiyu Cheng, Yanyan Shen, and Linpeng Huang. 2020. Adaptive factorization network: Learning adaptive-order
feature interactions. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 34. 3609–3616.
[11] Yue Cui, Hao Sun, Yan Zhao, Hongzhi Yin, and Kai Zheng. 2021. Sequential-knowledge-aware next POI recommenda-
tion: A meta-learning approach. ACM Transactions on Information Systems (TOIS) 40, 2 (2021), 1–22.
[12] Tiago Cunha, Carlos Soares, and André CPLF de Carvalho. 2016. Selecting collaborative filtering algorithms using
metalearning. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases. Springer,
393–409.
[13] Tiago Cunha, Carlos Soares, and André CPLF de Carvalho. 2018. Metalearning and Recommender Systems: A
literature review and empirical study on the algorithm selection problem for Collaborative Filtering. Information
Sciences 423 (2018), 128–144.
[14] Yadong Ding, Yu Wu, Chengyue Huang, Siliang Tang, Yi Yang, Longhui Wei, Yueting Zhuang, and Qi Tian. 2022.
Learning to Learn by Jointly Optimizing Neural Architecture and Weights. In Proceedings of the IEEE/CVF Conference
on Computer Vision and Pattern Recognition, Vol. 2.
[15] Manqing Dong, Feng Yuan, Lina Yao, Xiwei Xu, and Liming Zhu. 2020. Mamo: Memory-augmented meta-optimization
for cold-start recommendation. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge
Discovery & Data Mining. 688–697.
[16] Yuntao Du, Xinjun Zhu, Lu Chen, Ziquan Fang, and Yunjun Gao. 2022. MetaKG: Meta-learning on Knowledge Graph
for Cold-start Recommendation. IEEE Transactions on Knowledge and Data Engineering (2022).
[17] Zhengxiao Du, Xiaowei Wang, Hongxia Yang, Jingren Zhou, and Jie Tang. 2019. Sequential scenario-specific meta
learner for online recommendation. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining.
[41] Wang-Cheng Kang and Julian J. McAuley. 2018. Self-Attentive Sequential Recommendation. In IEEE International
Conference on Data Mining, ICDM. IEEE Computer Society, 197–206.
[42] Jaehong Kim, Sangyeul Lee, Sungwan Kim, Moonsu Cha, Jung Kwon Lee, Youngduck Choi, Yongseok Choi, Dong-Yeon
Cho, and Jiwon Kim. 2018. Auto-meta: Automated gradient based meta learner search. arXiv preprint arXiv:1806.06927
(2018).
[43] Minseok Kim, Hwanjun Song, Doyoung Kim, Kijung Shin, and Jae-Gil Lee. 2021. PREMERE: Meta-Reweighting via
Self-Ensembling for Point-of-Interest Recommendation. In Proceedings of the AAAI Conference on Artificial Intelligence,
Vol. 35. 4164–4171.
[44] Minseok Kim, Hwanjun Song, Yooju Shin, Dongmin Park, Kijung Shin, and Jae-Gil Lee. 2022. Meta-Learning for
Online Update of Recommender Systems. (2022).
[45] Gregory Koch, Richard Zemel, Ruslan Salakhutdinov, et al. 2015. Siamese neural networks for one-shot image
recognition. In ICML deep learning workshop, Vol. 2. Lille, 0.
[46] Julia Lasserre, Abdul-Saboor Sheikh, Evgenii Koriagin, Urs Bergman, Roland Vollgraf, and Reza Shirvany. 2020.
Meta-learning for size and fit recommendation in fashion. In Proceedings of the 2020 SIAM international conference on
data mining. SIAM, 55–63.
[47] Hoyeop Lee, Jinbae Im, Seongwon Jang, Hyunsouk Cho, and Sehee Chung. 2019. Melu: Meta-learned user prefer-
ence estimator for cold-start recommendation. In Proceedings of the 25th ACM SIGKDD International Conference on
Knowledge Discovery & Data Mining. 1073–1082.
[48] Hung-yi Lee, Shang-Wen Li, and Ngoc Thang Vu. 2022. Meta Learning for Natural Language Processing: A Survey.
arXiv preprint arXiv:2205.01500 (2022).
[49] Jingjing Li, Ke Lu, Zi Huang, and Heng Tao Shen. 2021. On Both Cold-Start and Long-Tail Recommendation with
Social Data. IEEE Trans. Knowl. Data Eng. 33, 1 (2021), 194–208.
[50] Jing Li, Pengjie Ren, Zhumin Chen, Zhaochun Ren, Tao Lian, and Jun Ma. 2017. Neural Attentive Session-based
Recommendation. In Proceedings of the 2017 ACM on Conference on Information and Knowledge Management. ACM,
1419–1428.
[51] Zhao Li, Haobo Wang, Donghui Ding, Shichang Hu, Zhen Zhang, Weiwei Liu, Jianliang Gao, Zhiqiang Zhang, and Ji
Zhang. 2020. Deep Interest-Shifting Network with Meta-Embeddings for Fresh Item Recommendation. Complexity
2020 (2020).
[52] Dongze Lian, Yin Zheng, Yintao Xu, Yanxiong Lu, Leyu Lin, Peilin Zhao, Junzhou Huang, and Shenghua Gao.
2019. Towards fast adaptation of neural architectures with meta learning. In International Conference on Learning
Representations.
[53] Xixun Lin, Jia Wu, Chuan Zhou, Shirui Pan, Yanan Cao, and Bin Wang. 2021. Task-adaptive Neural Process for User
Cold-Start Recommendation. In Proceedings of the Web Conference 2021. 1306–1316.
[54] Yujie Lin, Pengjie Ren, Zhumin Chen, Zhaochun Ren, Dongxiao Yu, Jun Ma, Maarten de Rijke, and Xiuzhen Cheng.
2020. Meta Matrix Factorization for Federated Rating Predictions. In Proceedings of the 43rd International ACM SIGIR
Conference on Research and Development in Information Retrieval. 981–990.
[55] Hanxiao Liu, Karen Simonyan, and Yiming Yang. 2018. DARTS: Differentiable Architecture Search. In International
Conference on Learning Representations.
[56] Jialin Liu, Fei Chao, and Chih-Min Lin. 2020. Task augmentation by rotating for meta-learning. arXiv preprint
arXiv:2003.00804 (2020).
[57] Zhaoyang Liu, Haokun Chen, Fei Sun, Xu Xie, Jinyang Gao, Bolin Ding, and Yanyan Shen. 2020. Intent Preference
Decoupling for User Representation on Online Recommender System.. In IJCAI. 2575–2582.
[58] Yuanfu Lu, Yuan Fang, and Chuan Shi. 2020. Meta-learning on heterogeneous information networks for cold-start
recommendation. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data
Mining. 1563–1573.
[59] Mi Luo, Fei Chen, Pengxiang Cheng, Zhenhua Dong, Xiuqiang He, Jiashi Feng, and Zhenguo Li. 2020. Metaselector:
Meta-learning for recommendation with user-level adaptive model selection. In Proceedings of The Web Conference
2020. 2507–2513.
[60] Shuai Luo, Yujie Li, Pengxiang Gao, Yichuan Wang, and Seiichi Serikawa. 2022. Meta-seg: A survey of meta-learning
for image segmentation. Pattern Recognition (2022), 108586.
[61] Yao Ma, Shilin Zhao, Weixiao Wang, Yaoman Li, and Irwin King. 2022. Multimodality in meta-learning: A compre-
hensive survey. Knowledge-Based Systems (2022), 108976.
[62] Tong Man, Huawei Shen, Xiaolong Jin, and Xueqi Cheng. 2017. Cross-domain recommendation: An embedding and
mapping approach.. In IJCAI, Vol. 17. 2464–2470.
[63] Luke Metz, Niru Maheswaranathan, Brian Cheung, and Jascha Sohl-Dickstein. 2018. Meta-Learning Update Rules for
Unsupervised Representation Learning. In International Conference on Learning Representations.
[64] Nikhil Mishra, Mostafa Rohaninejad, Xi Chen, and Pieter Abbeel. 2018. A Simple Neural Attentive Meta-Learner. In
International Conference on Learning Representations.
[65] Shikhar Murty, Tatsunori B Hashimoto, and Christopher D Manning. 2021. Dreca: A general task augmentation
strategy for few-shot natural language inference. In Proceedings of the 2021 Conference of the North American Chapter
of the Association for Computational Linguistics: Human Language Technologies. 1113–1125.
[66] Krishna Prasad Neupane, Ervine Zheng, Yu Kong, and Qi Yu. 2022. A Dynamic Meta-Learning Model for Time-Sensitive
Cold-Start Recommendations. Genre 2 (2022), 3–0.
[67] Krishna Prasad Neupane, Ervine Zheng, and Qi Yu. 2021. MetaEDL: Meta Evidential Learning For Uncertainty-Aware
Cold-Start Recommendations. In 2021 IEEE International Conference on Data Mining (ICDM). IEEE, 1258–1263.
[68] Wentao Ouyang, Xiuwu Zhang, Shukui Ren, Li Li, Kun Zhang, Jinmei Luo, Zhaojie Liu, and Yanlong Du. 2021. Learning
Graph Meta Embeddings for Cold-Start Ads in Click-Through Rate Prediction. In SIGIR ’21: The 44th International
ACM SIGIR Conference on Research and Development in Information Retrieval, Virtual Event, Canada, July 11-15, 2021,
Fernando Diaz, Chirag Shah, Torsten Suel, Pablo Castells, Rosie Jones, and Tetsuya Sakai (Eds.). ACM, 1157–1166.
https://doi.org/10.1145/3404835.3462879
[69] Feiyang Pan, Shuokai Li, Xiang Ao, Pingzhong Tang, and Qing He. 2019. Warm up cold-start advertisements:
Improving ctr predictions via learning to learn id embeddings. In Proceedings of the 42nd International ACM SIGIR
Conference on Research and Development in Information Retrieval. 695–704.
[70] Haoyu Pang, Fausto Giunchiglia, Ximing Li, Renchu Guan, and Xiaoyue Feng. 2022. PNMTA: A Pretrained Network
Modulation and Task Adaptation Approach for User Cold-Start Recommendation. In Proceedings of the ACM Web
Conference 2022. 348–359.
[71] Danni Peng, Sinno Jialin Pan, Jie Zhang, and Anxiang Zeng. 2021. Learning an Adaptive Meta Model-Generator for
Incrementally Updating Recommender Systems. In Fifteenth ACM Conference on Recommender Systems. 411–421.
[72] Ethan Perez, Florian Strub, Harm De Vries, Vincent Dumoulin, and Aaron Courville. 2018. Film: Visual reasoning
with a general conditioning layer. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 32.
[73] Ricardo BC Prudêncio and Teresa B Ludermir. 2004. Meta-learning approaches to selecting time series models.
Neurocomputing 61 (2004), 121–137.
[74] Siyuan Qiao, Chenxi Liu, Wei Shen, and Alan L Yuille. 2018. Few-shot image recognition by predicting parameters
from activations. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 7229–7238.
[75] Guanjin Qu, Huaming Wu, Ruidong Li, and Pengfei Jiao. 2021. Dmro: A deep meta reinforcement learning-based task
offloading framework for edge-cloud computing. IEEE Transactions on Network and Service Management 18, 3 (2021),
3448–3459.
[76] Yanru Qu, Han Cai, Kan Ren, Weinan Zhang, Yong Yu, Ying Wen, and Jun Wang. 2016. Product-based neural networks
for user response prediction. In 2016 IEEE 16th International Conference on Data Mining (ICDM). IEEE, 1149–1154.
[77] Sachin Ravi and Hugo Larochelle. 2016. Optimization as a model for few-shot learning. (2016).
[78] Yi Ren, Cuirong Chi, and Zhang Jintao. 2019. A Survey of Personalized Recommendation Algorithm Selection Based
on Meta-learning. In The International Conference on Cyber Security Intelligence and Analytics. Springer, 1383–1388.
[79] Steffen Rendle, Christoph Freudenthaler, and Lars Schmidt-Thieme. 2010. Factorizing personalized Markov chains for
next-basket recommendation. In Proceedings of the 19th International Conference on World Wide Web,. ACM, 811–820.
[80] André Luis Debiaso Rossi, André Carlos Ponce de Leon Ferreira, Carlos Soares, Bruno Feres De Souza, et al. 2014.
MetaStream: A meta-learning based method for periodic algorithm selection in time-changing data. Neurocomputing
127 (2014), 52–64.
[81] Andrei A Rusu, Dushyant Rao, Jakub Sygnowski, Oriol Vinyals, Razvan Pascanu, Simon Osindero, and Raia Hadsell.
2018. Meta-Learning with Latent Embedding Optimization. In International Conference on Learning Representations.
[82] Aravind Sankar, Junting Wang, Adit Krishnan, and Hari Sundaram. 2021. ProtoCF: Prototypical Collaborative Filtering
for Few-shot Recommendation. In Fifteenth ACM Conference on Recommender Systems. 166–175.
[83] Adam Santoro, Sergey Bartunov, Matthew Botvinick, Daan Wierstra, and Timothy Lillicrap. 2016. Meta-learning
with memory-augmented neural networks. In International conference on machine learning. PMLR, 1842–1850.
[84] Victor Garcia Satorras and Joan Bruna Estrach. 2018. Few-Shot Learning with Graph Neural Networks. In International
Conference on Learning Representations.
[85] Albert Shaw, Wei Wei, Weiyang Liu, Le Song, and Bo Dai. 2019. Meta architecture search. Advances in Neural
Information Processing Systems 32 (2019).
[86] Noam Shazeer, Azalia Mirhoseini, Krzysztof Maziarz, Andy Davis, Quoc Le, Geoffrey Hinton, and Jeff Dean. 2017.
Outrageously large neural networks: The sparsely-gated mixture-of-experts layer. arXiv preprint arXiv:1701.06538
(2017).
[87] Qijie Shen, Hong Wen, Wanjie Tao, Jing Zhang, Fuyu Lv, Zulong Chen, and Zhao Li. 2022. Deep Interest Highlight
Network for Click-Through Rate Prediction in Trigger-Induced Recommendation. In WWW ’22: The ACM Web
Conference 2022. ACM, 422–430.
[88] Jake Snell, Kevin Swersky, and Richard Zemel. 2017. Prototypical networks for few-shot learning. Advances in neural
information processing systems 30 (2017).
[89] Jiayu Song, Jiajie Xu, Rui Zhou, Lu Chen, Jianxin Li, and Chengfei Liu. 2021. CBML: A Cluster-based Meta-learning
Model for Session-based Recommendation. In Proceedings of the 30th ACM International Conference on Information &
Knowledge Management. 1713–1722.
[90] Xiaoyuan Su and Taghi M Khoshgoftaar. 2009. A survey of collaborative filtering techniques. Advances in artificial
intelligence 2009 (2009).
[91] Huimin Sun, Jiajie Xu, Kai Zheng, Pengpeng Zhao, Pingfu Chao, and Xiaofang Zhou. 2021. MFNP: A Meta-optimized
Model for Few-shot Next POI Recommendation. In Proceedings of the Thirtieth International Joint Conference on
Artificial Intelligence (IJCAI-21).
[92] Ke Sun, Tieyun Qian, Tong Chen, Yile Liang, Quoc Viet Hung Nguyen, and Hongzhi Yin. 2020. Where to Go Next:
Modeling Long- and Short-Term User Preferences for Point-of-Interest Recommendation. In The Thirty-Fourth AAAI
Conference on Artificial Intelligence. AAAI Press, 214–221.
[93] Xuehan Sun, Tianyao Shi, Xiaofeng Gao, Yanrong Kang, and Guihai Chen. 2021. FORM: Follow the Online Regularized
Meta-Leader for Cold-Start Recommendation. In Proceedings of the 44th International ACM SIGIR Conference on
Research and Development in Information Retrieval. 1177–1186.
[94] Yinan Sun, Kang Yin, Hehuan Liu, Si Li, Yajing Xu, and Jun Guo. 2021. Meta-Learned Specific Scenario Interest
Network for User Preference Prediction. In Proceedings of the 44th International ACM SIGIR Conference on Research
and Development in Information Retrieval. 1970–1974.
[95] Flood Sung, Yongxin Yang, Li Zhang, Tao Xiang, Philip HS Torr, and Timothy M Hospedales. 2018. Learning to
compare: Relation network for few-shot learning. In Proceedings of the IEEE conference on computer vision and pattern
recognition. 1199–1208.
[96] Qiuling Suo, Jingyuan Chou, Weida Zhong, and Aidong Zhang. 2020. Tadanet: Task-adaptive network for graph-
enriched meta-learning. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery &
Data Mining. 1789–1799.
[97] Haining Tan, Di Yao, Tao Huang, Baoli Wang, Quanliang Jing, and Jingping Bi. 2021. Meta-Learning Enhanced Neural
ODE for Citywide Next POI Recommendation. In 2021 22nd IEEE International Conference on Mobile Data Management
(MDM). IEEE, 89–98.
[98] Joaquin Vanschoren. 2018. Meta-learning: A survey. arXiv preprint arXiv:1810.03548 (2018).
[99] Manasi Vartak, Arvind Thiagarajan, Conrado Miranda, Jeshua Bratman, and Hugo Larochelle. 2017. A Meta-Learning
Perspective on Cold-Start Recommendations for Items. Advances in Neural Information Processing Systems 30 (2017),
6904–6914.
[100] Oriol Vinyals, Charles Blundell, Timothy Lillicrap, Daan Wierstra, et al. 2016. Matching networks for one shot
learning. Advances in neural information processing systems 29 (2016).
[101] Risto Vuorio, Shao-Hua Sun, Hexiang Hu, and Joseph J Lim. 2019. Multimodal model-agnostic meta-learning via
task-aware modulation. Advances in Neural Information Processing Systems 32 (2019).
[102] Jianling Wang, Kaize Ding, and James Caverlee. 2021. Sequential Recommendation for Cold-start Users with Meta
Transitional Learning. In Proceedings of the 44th International ACM SIGIR Conference on Research and Development in
Information Retrieval. 1783–1787.
[103] Jin Wang, Jia Hu, Geyong Min, Albert Y Zomaya, and Nektarios Georgalas. 2020. Fast adaptive task offloading in
edge computing based on meta reinforcement learning. IEEE Transactions on Parallel and Distributed Systems 32, 1
(2020), 242–253.
[104] Jane X Wang, Zeb Kurth-Nelson, Dhruva Tirumala, Hubert Soyer, Joel Z Leibo, Remi Munos, Charles Blundell,
Dharshan Kumaran, and Matt Botvinick. 2016. Learning to reinforcement learn. arXiv preprint arXiv:1611.05763
(2016).
[105] Li Wang, Binbin Jin, Zhenya Huang, Hongke Zhao, Defu Lian, Qi Liu, and Enhong Chen. 2021. Preference-Adaptive
Meta-Learning for Cold-Start Recommendation. In Proceedings of the Thirtieth International Joint Conference on Artifi-
cial Intelligence, IJCAI-21, Zhi-Hua Zhou (Ed.). International Joint Conferences on Artificial Intelligence Organization,
1607–1614. Main Track.
[106] Xiang Wang, Xiangnan He, Meng Wang, Fuli Feng, and Tat-Seng Chua. 2019. Neural graph collaborative filtering.
In Proceedings of the 42nd international ACM SIGIR conference on Research and development in Information Retrieval.
165–174.
[107] Tianxin Wei, Ziwei Wu, Ruirui Li, Ziniu Hu, Fuli Feng, Xiangnan He, Yizhou Sun, and Wei Wang. 2020. Fast Adaptation
for Cold-start Collaborative Filtering with Meta-learning. In 2020 IEEE International Conference on Data Mining (ICDM).
IEEE, 661–670.
[108] Wei Wei, Chao Huang, Lianghao Xia, Yong Xu, Jiashu Zhao, and Dawei Yin. 2022. Contrastive meta learning with
behavior multiplicity for recommendation. In Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining.
[130] Yaohui Zhu, Chenlong Liu, and Shuqiang Jiang. 2020. Multi-attention Meta Learning for Few-shot Fine-grained
Image Recognition.. In IJCAI. 1090–1096.
[131] Yongchun Zhu, Yudan Liu, Ruobing Xie, Fuzhen Zhuang, Xiaobo Hao, Kaikai Ge, Xu Zhang, Leyu Lin, and Juan Cao.
2021. Learning to Expand Audience via Meta Hybrid Experts and Critics for Recommendation and Advertising. In KDD
’21: The 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Virtual Event, Singapore, August 14-18,
2021, Feida Zhu, Beng Chin Ooi, and Chunyan Miao (Eds.). ACM, 4005–4013. https://doi.org/10.1145/3447548.3467093
[132] Yongchun Zhu, Zhenwei Tang, Yudan Liu, Fuzhen Zhuang, Ruobing Xie, Xu Zhang, Leyu Lin, and Qing He.
2021. Personalized Transfer of User Preferences for Cross-domain Recommendation. CoRR abs/2110.11154 (2021).
arXiv:2110.11154 https://arxiv.org/abs/2110.11154
[133] Yongchun Zhu, Ruobing Xie, Fuzhen Zhuang, Kaikai Ge, Ying Sun, Xu Zhang, Leyu Lin, and Juan Cao. 2021. Learning
to Warm Up Cold Item Embeddings for Cold-start Recommendation with Meta Scaling and Shifting Networks. arXiv
preprint arXiv:2105.04790 (2021).
[134] Ziwei Zhu, Shahin Sefati, Parsa Saadatpanah, and James Caverlee. 2020. Recommendation for new users and new
items via randomized training and mixture-of-experts transformation. In Proceedings of the 43rd International ACM
SIGIR Conference on Research and Development in Information Retrieval. 1121–1130.
[135] Lixin Zou, Long Xia, Yulong Gu, Xiangyu Zhao, Weidong Liu, Jimmy Xiangji Huang, and Dawei Yin. 2020. Neural
interactive collaborative filtering. In Proceedings of the 43rd International ACM SIGIR Conference on Research and
Development in Information Retrieval. 749–758.