
Initial work on using multiprocessing SharedMemory feature#19

Merged
ymd-h merged 4 commits into ymd-h:master from jamartinh:Feature_shared_memory
Feb 26, 2022

Conversation

@jamartinh
Contributor

It is working well; tested with Ray on Linux.
@ymd-h
Owner

ymd-h commented Feb 20, 2022

@jamartinh
Great thanks!
Your contribution is very helpful.

Can you add a fallback for Python 3.7?
Otherwise, from multiprocessing import shared_memory will raise an ImportError.

I think the SyncManager-based Event and Lock should be optional; they are more expensive, so it is better to use them only when users explicitly ask for them.

It seems that the SyncManager is never shut down.
Is its process killed correctly when the Python interpreter terminates?


Also, I've set up unit tests in GitHub Actions for PRs. From the next update, they will probably run here, too.

@ymd-h
Owner

ymd-h commented Feb 20, 2022

Additional comments:

  • A fixed, simple auth_key is questionable. How about using os.urandom(), as the default configuration does?
    • Someone might run multiple sets of learner-explorers on a single powerful server. It is better to avoid conflicts.
  • The Value constructor can accept a Lock / RLock, so you don't need to hold an additional explorer_count_lock.
    • Moreover, if we use SyncManager.Value, we don't need an additional Lock, right?
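The second bullet can be sketched as follows (the `bump` helper is hypothetical, not the PR's code): a plain multiprocessing.Value is created with an internal lock by default, so no separate explorer_count_lock is needed for a shared counter.

```python
import multiprocessing as mp

def bump(counter, n):
    # Increment a shared counter using the Value's own built-in lock,
    # instead of holding a separate Lock object.
    for _ in range(n):
        with counter.get_lock():
            counter.value += 1

if __name__ == "__main__":
    explorer_count = mp.Value('i', 0)  # 'i' = C int; lock=True by default
    workers = [mp.Process(target=bump, args=(explorer_count, 1000))
               for _ in range(4)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    print(explorer_count.value)  # -> 4000
```

Note that `counter.value += 1` is not atomic on its own, which is why the increment is wrapped in `get_lock()`.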

@jamartinh
Contributor Author

Hi @ymd-h, I will check all these details! This is a preliminary version to show you how it is going.
I will refine the things you point out.

@ymd-h
Owner

ymd-h commented Feb 20, 2022

@jamartinh
Thank you for your support.

I also would like to discuss another point:

Modifying auth_key has a side effect which might break other server processes.
So I think it is better for users to set auth_key explicitly outside the buffer, as you mentioned in the discussion.

import base64
import ctypes
import multiprocessing as mp
from multiprocessing.managers import SyncManager
import time

import ray

def run():
    ray.init()

    m = SyncManager()
    m.start()

    v = m.Value(ctypes.c_int, 0)
    lock = m.Lock()
    authkey = mp.current_process().authkey

    # Encode base64 to avoid following error:
    #   TypeError: Pickling an AuthenticationString object is disallowed for security reasons
    encoded = base64.b64encode(authkey)

    def auth_fn(*args):
        mp.current_process().authkey = base64.b64decode(encoded)

    ray.worker.global_worker.run_function_on_all_workers(auth_fn)

    @ray.remote
    def remote_lock(v, L):
        with L:
            v.value += 1

    try:
        print(v.value) # -> 0
        ray.get([remote_lock.remote(v, lock),
                 remote_lock.remote(v, lock)])
        print(v.value) # -> 2
    finally:
        m.shutdown()


if __name__ == "__main__":
    run()

PS1: I was wrong. SyncManager.Value cannot hold a Lock object. (ref)

PS2: If you feel our request is too much, please let us know. We can take over at any time.

@jamartinh
Contributor Author

@ymd-h thanks !
BTW, do you think it is better to have a ShmMPReplay buffer and ShmMPriority ... (i.e., separate classes for this feature), or do you prefer having the SharedMemory option just as a parameter in the constructors?

+ fix to not change the authkey
+ clean up source code
+ clean up the pickling set/get state functions
@jamartinh
Contributor Author

@ymd-h thanks,

Now it is just:

authkey = mp.current_process().authkey
encoded = base64.b64encode(authkey)

def auth_fn(*args):
    mp.current_process().authkey = base64.b64decode(encoded)
    print("multiprocessing authkey set")

ray.worker.global_worker.run_function_on_all_workers(auth_fn)

And there are no more issues with the authkey, and no dangerous side effects on a node from changing it.

Thanks for the tips!

@jamartinh
Contributor Author

The fallback for Python < 3.8 still remains to be done.

@ymd-h
Owner

ymd-h commented Feb 21, 2022

@jamartinh
Great thanks.

BTW, do you think it is better to have a ShmMPReplay buffer and ShmMPriority ... (i.e., separate classes for this feature), or do you prefer having the SharedMemory option just as a parameter in the constructors?

I think a constructor option is enough, unless there is non-negligible overhead in add / sample etc.

Unfortunately, it seems that it doesn't work on Windows and macOS.

…Windows is that Windows only accepts spawn (not fork), so classes have to explicitly support pickling and unpickling of all objects, especially the synchronized and shared-memory ones.

I have run the tests locally on a Windows 2016 server with Python 3.9, and the mp.py tests succeeded.
@ymd-h
Owner

ymd-h commented Feb 21, 2022

@jamartinh
Did you refactor the import sections?
Please don't let a tool refactor them. The order is intentional.

If you have any ideas for the import sections, please open a new discussion thread.

In particular, removing cimport numpy as np (placed just before import numpy as np) is not acceptable.

@jamartinh
Contributor Author

@ymd-h It seems my fingers automatically ran the code formatter, because I use it all the time. Yes, it is dangerous, since it optimizes imports and does many other things.

@jamartinh
Contributor Author

Well, after all, I think I can't dedicate more time to this.
I am not a software engineer; I am just a research scientist.
Thanks @ymd-h for paying attention to the need to use this with Ray or any other non-forking tool.
All the best,
JAMH

@ymd-h
Owner

ymd-h commented Feb 22, 2022

@jamartinh
Thank you for your great contribution.
We will take over this work and try to release it as soon as possible.

I'm sorry if I was impolite or too blunt in our discussion.
I would be happy if you continue to give us feedback.

I hope for your success.

Best wishes.

@ymd-h ymd-h merged commit 1f0f5d8 into ymd-h:master Feb 26, 2022