
Parallel Python execution for tool completion #470

Merged
yunfeng-scale merged 10 commits into main from yunfeng-parallel-python
Apr 5, 2024
Conversation

@yunfeng-scale
Contributor

@yunfeng-scale yunfeng-scale commented Mar 13, 2024

Pull Request Summary

benchmarked with some sample data (1000 prompts) on my devbox (96 CPU cores). since i used a thread pool to start the subprocesses, process startup might still be slowed down by the GIL, so utilization across the 96 cores is probably not high

with this change: 467s total, 283s tool use
without this change: 862s total, 687s tool use

Test Plan and Usage Guide

tested with sample data
also verified that CPU count detection works both inside and outside a container
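The note about CPU counting inside a container matters because `os.cpu_count()` reports the host's CPUs even when the container is restricted to fewer. A minimal sketch of one common way to get the usable count (assuming Linux; the `CPU_COUNT` name mirrors the constant in this PR's diff, but the fallback logic here is illustrative, not the PR's actual code):

```python
import os


def _cpu_count() -> int:
    # os.sched_getaffinity(0) reflects the CPUs this process may actually
    # run on, which respects container CPU pinning; fall back to
    # os.cpu_count() on platforms where it is unavailable (e.g. macOS).
    try:
        return len(os.sched_getaffinity(0))
    except AttributeError:
        return os.cpu_count() or 1


CPU_COUNT = _cpu_count()
```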

@yunfeng-scale yunfeng-scale requested a review from a team March 13, 2024 19:15
gen_item.completed = True
continue

pool = ThreadPool(CPU_COUNT)
Contributor Author

it's probably fine to use a single process to start subprocesses when the CPU count is low
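The `pool = ThreadPool(CPU_COUNT)` pattern above can be sketched as follows. This is a minimal illustration, not the PR's actual code: `run_snippet` and the sample snippets are hypothetical. It shows why a thread pool suffices here: each thread releases the GIL while it blocks waiting on its child process, so the subprocesses themselves run in parallel even though they are launched from threads.

```python
import subprocess
import sys
from multiprocessing.pool import ThreadPool

CPU_COUNT = 4  # stand-in for the repo's CPU_COUNT constant


def run_snippet(code: str) -> str:
    # Shell out to a fresh interpreter per task; the launching thread
    # blocks on the child (releasing the GIL) while the child runs.
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=30,
    )
    return result.stdout.strip()


snippets = [f"print({i} * {i})" for i in range(8)]
pool = ThreadPool(CPU_COUNT)
try:
    outputs = pool.map(run_snippet, snippets)
finally:
    pool.close()
    pool.join()
```

The GIL cost the author mentions is the Python-side work of *spawning* each child (building argv, forking/execing), which is serialized across threads; the children's own execution is not.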

@yunfeng-scale yunfeng-scale requested a review from Georgepu1 March 13, 2024 19:18

@Georgepu1 Georgepu1 left a comment

lgtm, thanks for investigating! btw were u able to validate the completions were close enough on the "467s total, 283s tool use" vs "862s total, 687s tool use" runs? just want to make sure nothing weird's happening here on the generation side

@yunfeng-scale
Contributor Author

> lgtm, thanks for investigating! btw were u able to validate the completions were close enough on the "467s total, 283s tool use" vs "862s total, 687s tool use" runs? just want to make sure nothing weird's happening here on the generation side

there are actually differences, but for a hand-picked sample i can't repro either output for some reason. i wonder if this is due to some tiny randomness in vLLM or in the Python execution. will investigate a bit more

@yunfeng-scale
Contributor Author

took a bit more of a look into this @Georgepu1: 15/1000 output texts changed. i manually checked 5 of them and i don't think the python code execution results are different; the differences appear to be randomness in sampling, and the sentences in both cases made sense to me

@yunfeng-scale yunfeng-scale requested a review from yixu34 April 5, 2024 18:34
@yunfeng-scale yunfeng-scale requested a review from yixu34 April 5, 2024 20:50
@yunfeng-scale yunfeng-scale merged commit c46162a into main Apr 5, 2024
@yunfeng-scale yunfeng-scale deleted the yunfeng-parallel-python branch April 5, 2024 23:27
3 participants