docker-MoE-LLaVA

This is the Docker image for gesen2egee/MoE-LLaVA-hf, a script that uses MoE-LLaVA to describe images. It is designed to prepare training-set captions for Stable Diffusion model training.

Get the Dockerfile at GitHub, or pull the image from ghcr.io.

🚀 Get your Docker ready for GPU support

Windows

Once you have installed Docker Desktop, the CUDA Toolkit, and the NVIDIA Windows driver, and ensured that Docker is running on WSL2, you are ready to go.

Here is the official documentation for further reference.
https://docs.nvidia.com/cuda/wsl-user-guide/index.html#nvidia-compute-software-support-on-wsl-2 https://docs.docker.com/desktop/wsl/use-wsl/#gpu-support

Linux, OSX

Install an NVIDIA GPU Driver if you do not already have one installed.
https://docs.nvidia.com/datacenter/tesla/tesla-installation-notes/index.html

Install the NVIDIA Container Toolkit with this guide.
https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html
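Once the toolkit is installed, a quick sanity check confirms that containers can see the GPU. The CUDA image tag below is only an example; any CUDA base image that ships nvidia-smi works.

```shell
# Sanity check: run nvidia-smi inside a throwaway CUDA container.
# If this prints your GPU table, Docker GPU support is working.
docker run --rm --gpus all nvidia/cuda:12.3.2-base-ubuntu22.04 nvidia-smi
```

If this command fails with an error about the `--gpus` flag or a missing runtime, revisit the NVIDIA Container Toolkit installation steps above.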

📦 Available Pre-built Image

You can pull the pre-built image, which does not include the models, from the GitHub Container Registry.
These images will download the models at runtime.

Mount the current directory as /dataset and run the script with additional input arguments.

Important

Remember to prepend -- before the arguments.

docker run --gpus all -it -v ".:/dataset" ghcr.io/jim60105/moe-llava:no_model -- [arguments]
# Example
docker run --gpus all -it -v ".:/dataset" ghcr.io/jim60105/moe-llava:no_model -- --moe --force --caption_style='mixed' --folder_name --modify_prompt --low_vram

The [arguments] placeholder should be replaced with the arguments for the script. Check the original Colab notebook for more information.

⚡️ Preserve the download cache for the models

You can mount /.cache as a volume to share model caches between containers.
This way, the models will not be re-downloaded every time the container starts.

docker run --gpus all -it -v ".:/dataset" -v "moe_cache:/.cache" ghcr.io/jim60105/moe-llava:no_model -- --moe --force --caption_style='mixed' --folder_name --modify_prompt --low_vram

🛠️ Building the Image with the Models Included

Caution

These models are extremely big! They inflate the image size to a whopping 40GB 😕
Building it is too time-consuming, and I suggest avoiding it.
Please use the no_model image and attach the /.cache volume as instructed earlier.

Important

Clone the Git repository recursively to include submodules:
git clone --recursive https://github.com/jim60105/docker-MoE-LLaVA.git

You can build the image that includes the models by targeting the final stage.
Use the LOW_VRAM build argument to choose which model to preload.

  • (No build-arg): Preload the LanguageBind/MoE-LLaVA-Phi2-2.7B-4e model.
  • LOW_VRAM=1: Preload the LanguageBind/MoE-LLaVA-StableLM-1.6B-4e-384 model.
docker build -t moe-llava --target final --build-arg LOW_VRAM=1 .

📝 LICENSE

Note

The main program, PKU-YuanGroup/MoE-LLaVA, and the predict script are distributed under the Apache License 2.0.
Please consult their repository for access to the source code and licenses.
The following is the license for the Dockerfiles and CI workflows in this repository.


GNU GENERAL PUBLIC LICENSE Version 3

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see https://www.gnu.org/licenses/.

Caution

A GPLv3-licensed Dockerfile means that you MUST distribute the source code under the same license if you

  • Redistribute the image. (You can simply point to this GitHub repository if you have not made any code changes.)
  • Distribute an image that uses code from this repository.
  • Or distribute an image based on this image. (FROM ghcr.io/jim60105/moe-llava in your Dockerfile)

"Distribute" means to make the image available for other people to download, usually by pushing it to a public registry. If you are solely using it for your personal purposes, this has no impact on you.

Please consult the LICENSE for more details.
