This repository is built on top of MiniGPT-4.
We introduce RecICL, a method that customizes recommendation-specific in-context learning (ICL) in LLMs for real-time recommendation. RecICL can be constructed in two ways: with or without collaborative information. Without collaborative information, we build training samples directly from users' history sequences. With collaborative information, we additionally inject user and item embeddings produced by hashGNN, a traditional collaborative filtering model.
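To make "constructing samples from history sequences" concrete, the sketch below shows what one in-context training sample could conceptually look like, with earlier interactions serving as in-context demonstrations and the most recent item as the prediction target. The field names and items are purely illustrative assumptions, not the repository's actual data format; see the prompt and dataset folders for the real templates.

```yaml
# Illustrative sketch only: field names and items are hypothetical and do
# NOT reflect the repository's actual sample format (see ./prompt and
# ./dataset for the real templates).
demonstrations:            # in-context examples built from users' histories
  - history: ["The Hobbit", "The Fellowship of the Ring"]
    next_item: "The Two Towers"
  - history: ["Dune", "Dune Messiah"]
    next_item: "Children of Dune"
query:
  history: ["Foundation", "Foundation and Empire"]
  target: "Second Foundation"   # label the model learns to predict
# With collaborative information, hashGNN user/item embeddings would
# additionally be injected alongside the textual history.
```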
Download this anonymous repository, unzip it, then create and activate a Python environment with the following commands:
cd RecICL-8003
conda env create -f environment.yml
conda activate minigpt4

Code Structure
├── minigpt4: Core code of RecICL, following the structure of MiniGPT-4.
│   ├── models: Defines our RecICL model architecture.
│   ├── datasets: Defines dataset classes.
│   ├── task: An overall task class defining the model and datasets used, the training epochs, and evaluation.
│   ├── runners: A runner class to train and evaluate a model based on a task.
│   └── common: Commonly used functions.
├── dataset: Dataset pre-processing.
├── RecICL-datasets: Processed Amazon-book and Amazon-movies datasets.
├── prompt: The prompts used.
├── train_configs: Training configuration files, setting hyper-parameters.
├── train_hash_cf4rec.py: RecICL training file.
├── hashGNN.py: HashGNN training file.
└── baseline_train_xx.py: Baseline training files.
Note: The folder train_configs currently contains only a template file with detailed explanations for each parameter. Please create your own YAML file based on the template and place it in this folder.
We used Qwen1.5-0.5B in our experiments, but you can choose any similar model (e.g., LLaMA-3 or Qwen2.5). Please specify the model path under the qwen_model parameter in the YAML file.
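As a rough illustration, a first-stage config might look like the sketch below. Apart from qwen_model (and the ckpt/evaluate parameters mentioned later), every section, key, and value here is an assumption following the MiniGPT-4-style layout this repository builds on; use the annotated template in train_configs for the actual field names.

```yaml
# Sketch only: except for qwen_model, all keys and values are placeholders.
# The model / datasets / run layout is assumed from MiniGPT-4-style configs.
model:
  qwen_model: /path/to/Qwen1.5-0.5B       # local path to the base LLM

datasets:
  amazon_book:                            # placeholder dataset name
    storage: RecICL-datasets/amazon-book  # placeholder path

run:
  output_dir: output/recicl_stage1        # placeholder: where checkpoints are saved
  max_epoch: 5                            # placeholder hyper-parameter
```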
We have placed the two datasets we used in the RecICL-datasets folder. If you need to preprocess your own dataset, you can also use the code in the ./datasets folder.
If you are using RecICL without injecting collaborative information, then after completing the above preparations, use the following command to start training:
CUDA_VISIBLE_DEVICES=6,7 WORLD_SIZE=2 nohup torchrun --nproc-per-node 2 --master_port=11139 train_hash_cf4rec.py --cfg-path=train_configs/your_config.yaml > /your/log/path/log.out &
Note: In the example above, we used two GPUs, but you can change the number of GPUs according to your needs.
To inject collaborative information, first complete the training above to obtain a checkpoint. Then, starting from that checkpoint, run a second training stage whose input text has collaborative information injected. The training command is the same as above; you only need to modify the YAML file, and the specific parameters to change are annotated in the template YAML file.
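For orientation only, the second-stage config mainly differs in loading the first-stage checkpoint and enabling the collaborative-information input. In the sketch below, ckpt is the parameter named in this README, while the key that switches on collaborative information is a placeholder whose real name is given in the annotated template.

```yaml
# Sketch only: ckpt is named in this README; the collaborative-information
# switch below is a placeholder key, not the actual parameter name.
model:
  qwen_model: /path/to/Qwen1.5-0.5B
  ckpt: output/recicl_stage1/checkpoint_best.pth   # first-stage checkpoint (placeholder filename)
  use_collaborative_info: True                     # placeholder key name
```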
To run evaluation only, set the hyper-parameters in the training config file as follows:
- ckpt: your_checkpoint_path # trained model path
- evaluate: True # only evaluate
Then run the same command as in the first-stage training.