Skip to content
This repository was archived by the owner on Apr 22, 2023. It is now read-only.

Cluster 2.0 – step 4 : Worker class and isolate internal messages#2388

Closed
AndreasMadsen wants to merge 45 commits intonodejs:masterfrom
AndreasMadsen:cluster2
Closed

Cluster 2.0 – step 4 : Worker class and isolate internal messages#2388
AndreasMadsen wants to merge 45 commits intonodejs:masterfrom
AndreasMadsen:cluster2

Conversation

@AndreasMadsen
Copy link
Copy Markdown
Member

I know this is a big patch, but it is the foundation of the cluster 2.0 and cannot be split without creating a hopeless design.

Documentation:
The documentation can be found here: doc

The important part here is that there now exist a Worker class/object this exist in both workers and clusters and are used in the same way. The worker object is a worker accessed using cluster.worker and in the cluster the object is return from fork and stored in cluster.workers and a callback argument in a lot of functions and events.

Example on worker object:
In the master you send a message to the worker using worker.send. In a worker you send a message to the master using worker.send.

The next important part is the message system, this now isolate internal messages from userland, and support callbacks both ways.

This changes may seam to have nothing to do with each other, but the Worker object use internal messages more and the message system don't make sense without a standardized Worker object.

The API changes are:

  • cluster.fork do no longer extend directly on on child_process.fork. The object return by child_process.fork must now be accessed from cluster.fork().process.
  • worker.kill is renamed to worker.destroy.
  • worker.on('exit') is now worker.on('death')
  • You should not use process.send but cluster.worker.send in the worker.
  • You should not use process.on('message') but cluster.worker.on('message') in the worker.
  • You should not use process.exit but cluster.worker.destroy, the difference is that destroy send a suicide message to the master, this indicate that we won't restart the worker (not implemented yet).
  • You may want to know that the env property CLUSTER_WORKER_ID is now called CLUSTER_UNIQUE_ID, but this was not a part of the official API.
  • The worker.send() has change a lot internal, but should not cause any problems, the old function exist now as worker.process.send()

The API additions are:

  • New events in the cluster object fork, online, listening
  • New events in the worker object online, listening
  • A state property in the worker object worker.state this can be none, online, listening, dead
  • A suicide property in the worker object worker.suicide if undefined the worker is alive, if false the worker died accidently, if true the worker died by purpose.
  • New cluster.eachWorker use it to run a function on every living worker.
  • A uniqueID is assigned to each worker and can be read from worker.workerID.
  • The send function now support callback in both worker and master, the callback will be called when the message is received.

This is about the merging in the cluster 2.0 ( see #2038 ) pull request

Comment thread doc/api/cluster.markdown Outdated
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove this change from the patch.

@ry
Copy link
Copy Markdown

ry commented Dec 20, 2011

@visionmedia care to review?

Comment thread doc/api/cluster.markdown Outdated
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a little too specific IMO, just exposing an array would be fine:

cluster.workers.forEach(...

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

workers are assigned to the cluster.worker by there uniqueID, this makes them very easy to find again. If we change this to an array people will get a lot of undefined values in there for loop. Also i don't like to have method like .push() that indicate someting there it very wrong.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well we can splice() to remove those holes

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would destroy the index

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another option is to remove the eachWorker and then have a getWorkerById function, but I don't think that's a better solution.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe:

cluster.workers = {};

var worker = new cluster.Worker;
cluster.workers[worker.id] = worker;

then people just do:

for (var id in cluster.workers) {
  whatever
}

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is how it works now, cluster.workers are public API (somehow I did not documenting it), the eachWorkers is just a wrapper around it. I say that we make the cluster.eachWorker internal only and make it as an example in the documentation, agree?

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

im just thinking an array is not the right structure here

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, well wtf haha yeah it's fine as-is then minus eachWorker

@AndreasMadsen
Copy link
Copy Markdown
Member Author

@ry I have done the IPC rewrite but how should userland isolate internal messages.

  • We could make a cluster.isInternalMessage(msg) but that's a nightmare to remember.
  • The next option is to tell userland to use cluster.worker.on('message') but we are getting away with that right?.
  • The last option, is to modify the core to filter all message containing a NODE_ prefix in its cmd property, and relay this messages to an internal event. I think is the the best option, but I'm not happy. It add unnecessary complexity to a good workeing API.

@AndreasMadsen
Copy link
Copy Markdown
Member Author

Also i have not merged the changes with this patch, because of the internal message filtering issue.

@ry
Copy link
Copy Markdown

ry commented Dec 28, 2011

Another option is to leak the messages out to the user.

The last option, is to modify the core to filter all message containing a NODE_ prefix in its cmd property, and relay this messages to an internal event. I think is the the best option, but I'm not happy. It add unnecessary complexity to a good workeing API.

This seems good.

Are the callbacks removed?

Next item for debate is cluster.worker. I want the workers to be able to do process.send() not cluster.worker.send.

@AndreasMadsen
Copy link
Copy Markdown
Member Author

@ry callbacks are removed on this patch, and process.send is supported in an experimental patch, i will merge it with this when i have modified process.on('message') and updated API doc. Thanks for the reply.

@AndreasMadsen
Copy link
Copy Markdown
Member Author

Okay this is a little noisy, so i will just tell what has changed :)

child_process:
The process.on('message') now filter out all messages with a _NODE prefix in there cmd property, and relay them to process.on('internalMessage'). The child_process documentation has been updated to tell users to avoid use of _NODE prefix and internalMessage event.

cluster:
The cluster.js core has been updated to depend on the message filter. This allow users to use process.send() and process.on('message') without any problems or internal message noise. Furthermore the cluster documentation has been updated, so it now tells users that they can use this methods/events. The old cluster.worker.send and cluster.worker.send('message') do still exist, but are now just simpel relays. They need to exist because the Worker class is isomorphic (used in both master and worker) and the master can't use its process object to send messages to workers.

@ry ^ that is what you requested, so lets move to the next next debate item :)

@AndreasMadsen
Copy link
Copy Markdown
Member Author

Ups the perfix is not _NODE but NODE_

Comment thread doc/api/cluster.markdown Outdated
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 space indent

@AndreasMadsen
Copy link
Copy Markdown
Member Author

This should fix it all, and testcases do parse :)

EDIT:
I did not fix the cluster.master.send thing, it confuse me.

@AndreasMadsen
Copy link
Copy Markdown
Member Author

@ry please make a final review and land this patch on master.

@ry ry closed this in 5f08c3c Jan 5, 2012
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants