Improving IsolationForest predict time #25150

@adrinjalali

Discussed in #25142

Originally posted by fsiola December 8, 2022
Hi,

When using IsolationForest predict, we go down the path to _compute_score_samples. This executes tree.apply and tree.decision_path. Both calls iterate over the tree for each sample in X, so we are evaluating the tree twice.

tree.decision_path returns a CSR matrix containing the indexes of the nodes that were visited in the tree, and the only thing done with it afterwards is to sum the number of indexes per sample.

We can save time in predict if, instead of calling tree.decision_path, there were a tree.decision_path_length that returns an integer. But that would require changing the _tree.pyx file. Some changes could also avoid the call to tree.apply, so that we no longer iterate over the tree twice.
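To make the idea concrete, here is a minimal pure-Python sketch of a single traversal that returns both the leaf index and the number of nodes visited. It is not scikit-learn code: the toy tree, the LEAF sentinel and the function name leaf_and_path_length are all made up for illustration, and the array names only loosely mirror the layout of a fitted tree. The count includes the root and the leaf, on the assumption that this matches what summing a row of the decision_path matrix gives.

```python
# Toy array-based binary tree. Every name here is hypothetical and only
# loosely mirrors the arrays of a fitted scikit-learn tree.
LEAF = -1

children_left = [1, LEAF, 3, LEAF, LEAF]
children_right = [2, LEAF, 4, LEAF, LEAF]
feature = [0, LEAF, 1, LEAF, LEAF]
threshold = [0.5, 0.0, 2.0, 0.0, 0.0]


def leaf_and_path_length(x):
    """Walk the tree once; return (leaf index, number of nodes visited)."""
    node = 0
    n_visited = 1  # the root counts as visited
    while children_left[node] != LEAF:
        if x[feature[node]] <= threshold[node]:
            node = children_left[node]
        else:
            node = children_right[node]
        n_visited += 1
    return node, n_visited


samples = [[0.2, 9.0], [0.9, 1.0], [0.9, 3.0]]
results = [leaf_and_path_length(x) for x in samples]
print(results)
```

With something like this done inside the Cython tree code, one pass per sample would give both values that the scoring currently obtains from two separate calls, and no sparse matrix would need to be built.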

Is this something that would be accepted as a PR, or would changing the tree Cython files for this not be accepted?
