python3 rec.py --base_model 'Viet-Mistral/Vistral-7B-Chat' --data_path './data/books.json' --output_dir './checkpoint'
git clone https://longcule123:[email protected]/Viet-Mistral/Vistral-7B-Chat
wget https://longcule123:[email protected]/longcule123/vi_book_data/resolve/main/shuffled_data_out.json
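A quick sanity check on the downloaded file (a sketch; it assumes the JSON is a top-level list of records):
# Print the record count and the first record
python3 -c "import json; data = json.load(open('shuffled_data_out.json')); print(len(data)); print(data[0])"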
"torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 8.00 GiB. GPU 0
has a total capacity of 14.58 GiB of which 3.12 GiB is free. Including non-PyTorch
memory, this process has 11.45 GiB memory in use. Of the allocated memory 11.20 GiB
is allocated by PyTorch, and 144.70 MiB is reserved by PyTorch but unallocated. If
reserved but unallocated memory is large try setting
PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See
documentation for Memory Management
(https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
0%| " What is this error?
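This is a standard CUDA out-of-memory failure: the run tried to allocate 8 GiB while only ~3 GiB of the 14.58 GiB card was still free. Two common mitigations, the first of which the message itself suggests (a sketch; the batch-size flags belong to the training script below):
# Let the allocator use expandable segments to avoid fragmentation (suggested by the error text)
export PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True
# Lower the per-device batch and raise gradient accumulation to keep the effective batch,
# e.g. --per_device_train_batch_size 8 --gradient_accumulation_steps 64 instead of 64 and 8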
git clone https://fubaoineverland:[email protected]/bkai-foundation-models/vietnamese-llama2-7b-40GB
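Putting the access token into the clone URL leaves it in plain text in .git/config; a sketch of the same download with the token kept in the environment instead (assumes a recent huggingface_hub with its CLI installed):
export HF_TOKEN=hf_your_token_here   # placeholder, not a real token
huggingface-cli download bkai-foundation-models/vietnamese-llama2-7b-40GB --local-dir ./vietnamese-llama2-7b-40GB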
scp -r -oProxyCommand="ssh -W %h:%p [email protected]" LightGBM [email protected]:/data2/Dang_DHCN/projects/longvv
#!/bin/bash
# Chat interactively with the base model plus the trained LoRA adapter (LLaMA-Factory CLI demo)
python src/cli_demo.py \
--model_name_or_path './vistral' \
--adapter_name_or_path './lora' \
--template default \
--finetuning_type lora
#!/bin/bash
# LoRA SFT run with LLaMA-Factory; --dataset names an entry registered in data/dataset_info.json (see below)
CUDA_VISIBLE_DEVICES=0 python ../../src/train_bash.py \
--stage sft \
--do_train \
--model_name_or_path './vistral-model' \
--dataset books_vi \
--dataset_dir ../../data \
--template default \
--finetuning_type lora \
--lora_target q_proj,v_proj \
--output_dir ../../saves/vistral/lora/sft \
--overwrite_cache \
--overwrite_output_dir \
--cutoff_len 1024 \
--per_device_train_batch_size 64 \
--per_device_eval_batch_size 64 \
--gradient_accumulation_steps 8 \
--lr_scheduler_type cosine \
--logging_steps 10 \
--save_steps 100 \
--eval_steps 100 \
--evaluation_strategy steps \
--load_best_model_at_end \
--learning_rate 5e-5 \
--num_train_epochs 3.0 \
--max_samples 3000 \
--val_size 0.1 \
--plot_loss \
--fp16
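Note that per_device_train_batch_size 64 with gradient_accumulation_steps 8 gives an effective batch of 64 × 8 = 512 samples per optimizer step; on the ~14.6 GiB GPU from the OOM trace above, a much smaller per-device batch (with more accumulation) is the likely fix.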
#!/bin/bash
#SBATCH --job-name=personReid # define job name
#SBATCH --nodes=1 # define node
#SBATCH --gpus-per-node=1 # define GPU limit in 1 node
#SBATCH --ntasks=1 # define number of tasks
#SBATCH --cpus-per-task=24 # there are 24 CPU cores
#SBATCH --time=2-00:00:00 # max running time = 2 days
#SBATCH --output="slurm-result/slurm-%j-%x.out"
nvidia-smi
echo "-----------------------------"
echo "## Print Python and cuda"
# Load module
# Some module avail:
## pytorch-extra-py39-cuda11.2-gcc9
## tensorflow2-extra-py39-cuda11.2-gcc9
## horovod-pytorch-py39-cuda11.2-gcc9
## horovod-tensorflow2-py39-cuda11.2-gcc9
## xgboost-py39-cuda11.2-gcc9
## fastai2-py39-cuda11.2-gcc9
#module load pytorch-extra-py39-cuda11.2-gcc9
# ACTIVATE ANACONDA
eval "$(conda shell.bash hook)"
conda activate longfinetune
python3 --version
sh ./run.sh
echo "------------PIP LIST-----------"
python3 -m pip list
echo "-----------------------------"
echo "SLURM_GPUS_ON_NODE=$SLURM_GPUS_ON_NODE"
echo "SLURM_GPUS_PER_NODE=$SLURM_GPUS_PER_NODE"
echo "SLURM_JOB_GPUS=$SLURM_JOB_GPUS"
echo "-----------------------------"
echo "Exit worker node"
3f9b45f4a5bb49fcc71427dbbc7e5b196eccc685
aa4154856b62393aed6f02643db5db5b1912a361
8a03185bb147440aaba6458866728473c40e0aff
9f768cbe57d833c8de44b9d59bf6d0b3c2fa8f7e
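The dataset is registered in LLaMA-Factory's data/dataset_info.json, which is what lets --dataset books_vi in the training script above resolve to books.json: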
"books_vi": {
"file_name": "books.json",
"file_sha1": "b258b9d8e5abdd17a075d2abcffa0512455c0b0e",
"ranking": true
}
# Merge the LoRA adapter into the base model and export the standalone weights
python src/export_model.py \
--model_name_or_path './vistral-model' \
--adapter_name_or_path './saves/vistral/lora/sft' \
--template default \
--finetuning_type lora \
--export_dir './model-out' \
--export_size 2 \
--export_legacy_format False
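Once the merge has finished, the exported weights in './model-out' can be used on their own, without the adapter; for example, reusing the CLI demo from above:
python src/cli_demo.py \
--model_name_or_path './model-out' \
--template default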