-
Notifications
You must be signed in to change notification settings - Fork 1.8k
support explicit build order #1333
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
src/build.cc
Outdated
|
|
||
| void Plan::ComputePriorityList(BuildLog* build_log) { | ||
|
|
||
| for (vector<Node*>::iterator it=targets_.begin();it!=targets_.end();++it){ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
space after ';'?
|
|
||
|
|
||
| // Dump graph to stdout for debugging / prototyping. | ||
| if (false) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
need this if?
| // Critical path scheduling. | ||
| // 0. Assign costs to all edges, using: | ||
| // a) The time the edge needed last time, if available. | ||
| // b) The average time this edge type needed, if this edge hasn't run before. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
b) is implemented?
src/build.cc
Outdated
| continue; | ||
| } | ||
| edgesQ.push(first); | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
insert first to done here?
src/build.cc
Outdated
|
|
||
| uint64_t prev_top_crit=0; | ||
|
|
||
| for (vector<Node*>::reverse_iterator tit = targets_.rbegin(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you improve order of this algorithm?
It looks like O(|targets| * |build edge|), it can be long.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for my inital idea, earlier built target should uses previous pass max critical time as inital value, so it is O(|targets| * |build edge|) but if you do not care to save bits of type of critical time(uint64_t) very much , we can use the longest path length (suppose N ) of whole graph as lowest priority target max critical time, and assign all targets initial critical time as
..... 3N 2N N 0
so we can just walk whole graph once to get N, then just follow nico's previous implementation, mark all critical time for all edges at once. it may waste more bits than O(|targets| * |build edge|) , but I think which is also desirable.
I've re-construct my code to get a better implementation
this is a quick implementation which improve my own build process immediately, useful when the longest job build time much more than all other small jobs. in my own test (3500 sources), total saved build time (8 min )is much more than this algorithms spent( which I havn't measured this exactly, at least I havn't seen big latency before first job is starting build) .
to my suprise, without explicit targets specified, this critical priority list didn't show big win over default ninja(just as nico mentioned), objects on ciritical path do build much earlier than default, but the link step, didn't being picked earlier very much. so I think the priority logic may still have something missed.
my initial implementation is incorrect
|
|
||
| /// Reset state. Clears want and ready sets. | ||
| void Reset(); | ||
| void ComputePriorityList(BuildLog* build_log); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No test for this function?
src/build.cc
Outdated
| } | ||
| edgesQ.push(first); | ||
|
|
||
| first->set_critical_time(max(first->critical_time(),prev_top_crit)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should be corrected as : first->set_critical_time(max(max(first->critical_time(),prev_top_crit),first->run_time_ms));
6eb094d to
6c97166
Compare
|
Hi @atetubou , thank you for your review, I've re-construct my code , now it use total build time of all edges as weight factor, much simpler and only added a little modification on nico's previous implementation. for easy reviewing my modification, I didn't delete the test codes ( the if(true) if (false) block and printf ), just make them not crash when you test them. |
|
Hi @domidimi , yes we should move debug output out of here, I'll address this in following commits. and for the vector usage, we need to preserve the order of targets appeared in command so I choose vector, sort-uniq and set indeed gives better performance for deduplicates, but both require more code to recovery their original order. when user provides targets through command line , they may hit command line length limit eariler before hitting deduplicate bottlelock. if this is really a problem (such as making ninja as a library and supplie targets by API), I'll recreate a better one (but with more codes). any suggestion ? |
|
Here a possible implementation for a tool. Not completely tested yet: int NinjaMain::ToolSchedule(const Options* /*options*/, int argc, char* argv[]) {
Plan plan;
vector<Node*> targets;
string err;
if (!CollectTargetsFromArgs(argc, argv, &targets, &err)) {
Error("%s", err.c_str());
return 1;
}
DependencyScan scan(&state_, &build_log_, &deps_log_, &hash_log_, &disk_interface_);
for (vector<Node*>::iterator it = targets.begin(); it != targets.end(); ++it) {
scan.RecomputeDirty(*it, &err);
if (!plan.AddTarget(*it, &err)) {
if (!err.empty()) {
Error("%s", err.c_str());
return 1;
}
}
}
plan.CriticalPathSchedule(&build_log_);
printf("priority list: \n");
for (list<Edge*>::iterator it = plan.priority_list_.begin(), end = plan.priority_list_.end();
it != end; ++it) {
Edge* edge = *it;
string input_str;
if (edge->inputs_.empty()) {
input_str = "empty";
} else {
input_str = edge->inputs_[0]->path();
}
Node* output = edge->outputs_[0];
printf("%s %s crit %d edge %d\n", input_str.c_str(),
output->path().c_str(), edge->critical_time(), edge->run_time_ms_);
}
return 0;
} |
|
What's the status of this pull request? Can I help somehow in landing this? In addition, I have a question/proposed change - it seems that having a separate file containing the previous runtimes of edges as opposed to always relying on log may be better as it prevents subsequent runs of ninja for different targets from removing the available information (e.g., assuming |
|
hi @rkhlebnikov I have no idea neither ... seems no one is reviewing this PR now . And for separate log topic, if I remember correctly , ninja won't purge log even if you build independent target , it will keep all command log until GC happened. so this should not be a problem (please correct me if this is not the case) |
|
|
||
| /// user provided targets in build order, earlier one have higher priority | ||
| vector<Node*> targets_; | ||
| list<Edge*> priority_list_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Anything stopping you from using a std::vector for the priority_list_, too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought at build.cc line 359
for (list<Edge*>::iterator it = priority_list_.begin(),
end = priority_list_.end();
it != end; ++it) {
i = ready_.find(*it);
if (i != ready_.end()) {
priority_list_.erase(it);
break;
}
}
this may remove element from priority_list_ at middle pos , list is more friendly ,and erase don't need a extra resize . (maybe vector is faster for iterate? I didn't benchmark this. I didn't even notice this changes because I just added minimal logic on original cpsched) , if vector is better choice ,I'll modify this.
This comment was marked as abuse.
This comment was marked as abuse.
build time for 'all': 8.2s -> 7.7s, 6% win @ 10 parallel processes only 7.8 -> 7.6 at @ 8 processes (on a laptop with 8 virtual cores)
when targets is given in command line , gives higher build priority to target appeared earlier. for example you know longestjob1 longjob2 in project have not been built before, you can run ninja longestjob1 longjob2 all to guild ninja schedule them first to improve overall build time. (this is only needed when such long job have never been built before for first time. ninja always auto schedule the best build order for all targets which built before). this is implmented by assign weight * priority as initial value when compute edge critical time. weight is computed as "total time of all targets built in serial" so its always big enough to assume forward edges of higher priority target always got higher value than edges on lower ones.
|
@jonesmz long time since my last try, I just make this 'compile' on top of master, but have no time/resource to verify these logic still works as expected (of course built binary accept explicit target and successfully build). If you have build environment powerful enough, feel free to test/improve this. |
this work based on nico's critical path schedule code, plus suporting
user controlled explicit build order. when targets is given in command line ,
try to build earlier appeared target first, for example
ninja verylongjob all
user can use this feature to let ninja schedule long job first to
improve overall build time. after initial build ,following builds can
also pick history data to find shortest build plan.
depends on target build time , this implementation may have a limitation which user may not control more than 42 targets build in order (use uint64_t to determine critical time ,if last build job is 1 hour ,about 2^22 ms , one higher priority job may double that number), ninja still work when this limit exceeds, just can not pick the ordered target (they may have same uint64 max critical time) I think this should be enough for most build.update: latest implementation do not have such limit. you can add much more targets ( about 2^64 / total_time , almost unlimited)
my test result posted here #376
it may also resolve #232