website/www/site/content/en/documentation/ml/large-language-modeling.md
20 additions & 6 deletions
@@ -25,17 +25,31 @@ RunInference works well on arbitrarily large models as long as they can fit on y
This example demonstrates running inference with a `T5` language model using `RunInference` in a pipeline. `T5` is an encoder-decoder model pre-trained on a multi-task mixture of unsupervised and supervised tasks, with each task converted into a text-to-text format. The example uses `T5-11B`, which contains 11 billion parameters and is 45 GB in size. To work well on a variety of tasks, `T5` prepends a different prefix to the input for each task. For example, for translation the input would be `translate English to German: …`, and for summarization it would be `summarize: …`. For more information about `T5`, see the [T5 overview](https://huggingface.co/docs/transformers/model_doc/t5) in the Hugging Face documentation.
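As a small, hedged illustration of that prefix convention (using the lightweight `t5-small` checkpoint as a stand-in for `T5-11B`, and example sentences chosen only for illustration), the only thing that changes between tasks is the prefix on the input text:

```
from transformers import AutoTokenizer

# Small stand-in for T5-11B; the prefix convention is the same.
tokenizer = AutoTokenizer.from_pretrained("t5-small")

# The task is selected purely by the prefix on the otherwise ordinary input text.
translate_ids = tokenizer(
    "translate English to German: The house is wonderful.",
    return_tensors="pt").input_ids
summarize_ids = tokenizer(
    "summarize: Apache Beam is a unified model for defining batch and "
    "streaming data-processing pipelines.",
    return_tensors="pt").input_ids
```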
### Run the Pipeline
First, install `apache-beam`:
```
pip install apache-beam -U
```
Next, install the required packages listed in [requirements.txt](https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/inference/large_language_modeling/requirements.txt) and pass the required arguments. You can download the `T5-11b` model from [Hugging Face Hub](https://huggingface.co/t5-11b) with the following steps:
- Install Git LFS following the instructions [here](https://docs.github.com/en/repositories/working-with-files/managing-large-files/installing-git-large-file-storage?platform=mac)
- Run `git lfs install`
- Run `git clone https://huggingface.co/t5-11b` (this may take a long time). This downloads the checkpoint; you then need to convert it to a model state dict as described [here](https://pytorch.org/tutorials/beginner/saving_loading_models.html#save-load-state-dict-recommended):
```
import torch
from transformers import T5ForConditionalGeneration

model = T5ForConditionalGeneration.from_pretrained("path/to/cloned/t5-11b")
# Illustrative output path; pass this file to the pipeline via --model_state_dict_path.
torch.save(model.state_dict(), "path/to/save/state_dict.pth")
```
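The pipeline's model handler later rebuilds the model from its configuration and loads this state dict. If you want a quick sanity check that the saved file loads cleanly, here is a minimal sketch, assuming the 45 GB state dict fits in RAM and using the same illustrative path as above:

```
import torch
from transformers import AutoConfig, T5ForConditionalGeneration

# Rebuild the architecture from the model config, then load the saved weights.
config = AutoConfig.from_pretrained("t5-11b")
model = T5ForConditionalGeneration(config)
model.load_state_dict(torch.load("path/to/save/state_dict.pth"))
```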
You can view the code on [GitHub](https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples/inference/large_language_modeling/main.py). You can run the pipeline in one of two ways (a simplified sketch of the pipeline's shape follows these options):
1. Locally on your machine: `python main.py --runner DirectRunner --model_state_dict_path <local or remote path to state_dict>`. You need to have 45 GB of disk space available to run this example.
2. On Google Cloud using Dataflow: `python main.py --runner DataflowRunner --model_state_dict_path <gs://path/to/saved/state_dict.pth> --project <PROJECT_ID> --region <REGION> --requirements_file requirements.txt --temp_location <gs://path/to/temp/location> --experiments "use_runner_v2,no_use_multiple_sdk_containers" --machine_type=n2-standard-16`. You can also pass other configuration parameters as described [here](https://cloud.google.com/dataflow/docs/guides/setting-pipeline-options#setting_required_options).
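For orientation, here is a rough sketch of the shape of such a pipeline. It is not the example itself (`main.py` on GitHub is the authoritative version); the prompt, paths, and model handler arguments are illustrative assumptions, and it assumes a Beam SDK recent enough to provide `make_tensor_model_fn`.

```
import apache_beam as beam
from apache_beam.ml.inference.base import RunInference
from apache_beam.ml.inference.pytorch_inference import PytorchModelHandlerTensor
from apache_beam.ml.inference.pytorch_inference import make_tensor_model_fn
from transformers import AutoConfig, AutoTokenizer, T5ForConditionalGeneration

MODEL_NAME = "t5-11b"                             # assumed model name
STATE_DICT_PATH = "path/to/saved/state_dict.pth"  # illustrative path

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)

model_handler = PytorchModelHandlerTensor(
    state_dict_path=STATE_DICT_PATH,
    model_class=T5ForConditionalGeneration,
    model_params={"config": AutoConfig.from_pretrained(MODEL_NAME)},
    # T5 produces text with generate() rather than a plain forward pass.
    inference_fn=make_tensor_model_fn("generate"),
)

with beam.Pipeline() as pipeline:
    _ = (
        pipeline
        | "Create" >> beam.Create(
            ["translate English to German: The house is wonderful."])
        | "Tokenize" >> beam.Map(
            lambda text: tokenizer(text, return_tensors="pt").input_ids[0])
        | "RunInference" >> RunInference(model_handler)
        | "Decode" >> beam.Map(
            lambda result: tokenizer.decode(
                result.inference, skip_special_tokens=True))
        | "Print" >> beam.Map(print)
    )
```

When running on Dataflow, the same handler simply points `state_dict_path` at the `gs://` location where you uploaded the saved state dict.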