HIPPO-Video Simulation Framework

This repository contains the official code for generating YouTube watch histories using an LLM-based user simulator.

📖 Proposed in the COLM 2025 paper:
HIPPO-VIDEO: Simulating Watch Histories with Large Language Models for History-Driven Video Highlighting

You can access the full dataset generated by this simulation pipeline on Hugging Face.


🚀 Getting Started

1. Clone and set up environment

git clone https://github.com/jeongeunnn-e/HIPPO-Video.git
cd HIPPO-Video

# Conda (recommended)
conda create -n hippo python=3.10 -y
conda activate hippo
pip install -r requirements.txt

2. Prepare config and seed data

Provide a configuration file and seed data, both in JSON format. An example seed file, seed_data.json, is included.

Example: config.json

{
  "data_path": "your_path/seed_data.json",
  "save_path": "your_path/outputs/",
  "donwload_path": "your_path/downloads/",
  "model_name": "gpt-4o",
  "max_length": 10,
  "OPENAI_API_KEY": "your_openai_key"
}
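
Before launching a run, it can help to confirm that the config file parses and contains the keys shown above. The following is a minimal sketch, not part of the repository: the helper name `load_config` is hypothetical, the key list is copied from the example (including the spelling `donwload_path`), and whether `run.py` actually requires every key is an assumption.

```python
import json

# Keys copied verbatim from the example config above.
# Assumption: run.py reads all of them; check the code if in doubt.
EXPECTED_KEYS = {
    "data_path",
    "save_path",
    "donwload_path",  # spelled as in the example config
    "model_name",
    "max_length",
    "OPENAI_API_KEY",
}


def load_config(path):
    """Load a JSON config and fail early if an expected key is missing."""
    with open(path, encoding="utf-8") as f:
        config = json.load(f)

    missing = sorted(EXPECTED_KEYS - set(config))
    if missing:
        raise KeyError(f"config is missing keys: {missing}")

    # max_length is an integer in the example; reject anything else.
    max_length = config["max_length"]
    if not isinstance(max_length, int) or max_length < 1:
        raise ValueError("max_length should be a positive integer")

    return config
```

Calling `load_config("config.json")` returns the parsed dictionary, or raises an error naming the missing keys.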

📑 Example: seed_data.json

[
  {
    "topic": "Clothes",
    "sub_topic": "Shoes",
    "feature": "informative",
    "initial_query": "how shoes are made from start to finish"
  },
  {
    "topic": "Music",
    "sub_topic": "Jazz",
    "feature": "emotional",
    "initial_query": "best emotional jazz solos"
  }
]

You can include multiple seeds to generate multiple simulated sessions.
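
When writing your own seeds, a quick check that each entry carries the four fields from the example can catch mistakes before any API calls are made. This is a sketch under assumptions: `validate_seeds` is a hypothetical helper, not a function from this repository, and treating all four fields as required non-empty strings is inferred from the example rather than confirmed by the code.

```python
import json

# Field names taken from the example seed_data.json above.
SEED_FIELDS = ("topic", "sub_topic", "feature", "initial_query")


def validate_seeds(seeds):
    """Return a list of problems; an empty list means every seed looks fine."""
    if not isinstance(seeds, list):
        return ["seed data should be a JSON array of objects"]

    problems = []
    for i, seed in enumerate(seeds):
        if not isinstance(seed, dict):
            problems.append(f"seed {i}: expected a JSON object")
            continue
        for field in SEED_FIELDS:
            value = seed.get(field)
            # Assumption: each field is a non-empty string, as in the example.
            if not isinstance(value, str) or not value.strip():
                problems.append(f"seed {i}: '{field}' is missing or empty")
    return problems


def check_seed_file(path):
    """Load a seed file from disk and validate its entries."""
    with open(path, encoding="utf-8") as f:
        return validate_seeds(json.load(f))
```

For instance, `check_seed_file("your_path/seed_data.json")` returns an empty list when every seed has all four fields.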

3. Run simulation

python run.py
