Optimize rake by replacing ipfn with vectorized IPF by neuralsorcerer · Pull Request #135 · facebookresearch/balance

neuralsorcerer · 2025-11-10T08:39:09Z

Added a vectorized _run_ipf_numpy helper that mirrors the original ipfn behaviour while avoiding the package dependency. Reworked the raking workflow to feed the new solver, rebuild the cell-weight mapping via DataFrame joins, and return the historical rake_weight series shape.
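
For readers unfamiliar with iterative proportional fitting (IPF), the idea behind a vectorized solver can be sketched roughly as follows. This is a minimal illustration, not the `_run_ipf_numpy` helper from this PR: the function name `ipf_sketch`, its parameters, and its convergence rule are all assumptions made for the example.

```python
import numpy as np


def ipf_sketch(table, margins, max_iter=1000, tol=1e-8):
    """Hypothetical helper: scale `table` until each axis sum matches its target.

    `margins[k]` is the target 1-D marginal for axis k of `table`.
    """
    fitted = np.asarray(table, dtype=float).copy()
    targets = [np.asarray(m, dtype=float) for m in margins]
    for _ in range(max_iter):
        max_diff = 0.0
        for axis, target in enumerate(targets):
            # Current marginal along this axis (sum over every other axis).
            other_axes = tuple(i for i in range(fitted.ndim) if i != axis)
            current = fitted.sum(axis=other_axes)
            # Scaling factors; margins that are currently zero are left unscaled.
            factors = np.divide(
                target, current, out=np.ones_like(target), where=current > 0
            )
            # Broadcast the factors along the chosen axis, with no Python loop over cells.
            shape = [1] * fitted.ndim
            shape[axis] = -1
            fitted *= factors.reshape(shape)
            max_diff = max(max_diff, float(np.max(np.abs(current - target))))
        # Stop once every marginal was already within tolerance before rescaling.
        if max_diff < tol:
            break
    return fitted


seed_table = np.array([[1.0, 2.0], [3.0, 4.0]])
fitted_table = ipf_sketch(seed_table, margins=[[40.0, 60.0], [30.0, 70.0]])
print(fitted_table.sum(axis=1), fitted_table.sum(axis=0))
```

The per-axis update is a single broadcasted multiplication, which is where a vectorized approach would be expected to save time over cell-by-cell updates.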

meta-codesync · 2025-11-10T15:20:37Z

@talgalili has imported this pull request. If you are a Meta employee, you can view this in D86672684.

talgalili · 2025-11-10T15:24:35Z

Thanks @neuralsorcerer - this is super cool!

Since this is a 'big deal' change, could you please:

1. Consider buffing up the tests for raking, just to make sure we're not missing any edge cases for this change? (If you think it's good as is, feel free to keep it as is.) Test file: https://github.com/facebookresearch/balance/blob/main/tests/test_rake.py
2. Add code (and output) for a benchmark before and after the change, just so we get a sense of the gain? While I like dropping ipfn from the dependencies, I want to make sure there are no edge cases that ipfn solves now (or will solve in the future) that we would miss.

WDYT?

neuralsorcerer · 2025-11-10T15:31:29Z

> 1. Consider buffing up the tests for raking - just to make sure we're not missing any edge cases for this change? (if you think it's good as is, then feel free to keep it as is)

I was planning to add more tests incrementally in upcoming PRs. I am happy to add more tests in this PR if we are fine with it.

> 2. Add code (and output) for doing a benchmark before and after the change - just so we'll get a sense of the gain? while I like skipping ipfn from the dependency, I want to make sure there are no edge cases that ipfn will solve now (or in the future), that we will miss.

That's a great suggestion. Will do it :)

talgalili · 2025-11-10T16:50:28Z

Regarding 2 - thanks! Regarding 1 - adding more ipfn tests in a separate PR is best. Then we can check that the current diff doesn't break them. This would increase our confidence that no edge cases are missed.


facebook-github-bot · 2025-11-11T08:25:47Z

@neuralsorcerer has updated the pull request. You must reimport the pull request before landing.

facebook-github-bot · 2025-11-11T08:39:52Z

@neuralsorcerer has updated the pull request. You must reimport the pull request before landing.

neuralsorcerer · 2025-11-11T08:43:28Z

Added a `benchmark_ipfn.py` file under the benchmark folder with the benchmark code.

| `ipfn` package | our in-tree numpy solver |
| --- | --- |
| 1.83 ms | 1.15 ms |
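
The benchmark script itself is not reproduced in this thread. As a rough idea of how per-call timings like these could be collected, here is a small sketch using the standard-library `timeit` module. The helpers `ipf_2d` and `best_time_ms`, the table size, and the iteration count are assumptions for illustration only and are not taken from `benchmark_ipfn.py`.

```python
import timeit

import numpy as np


def ipf_2d(seed, row_targets, col_targets, n_iter=50):
    """Hypothetical stand-in solver: fixed-iteration IPF on a 2-D table."""
    fitted = seed.astype(float).copy()
    for _ in range(n_iter):
        # Rescale rows, then columns, using broadcasting.
        fitted *= (row_targets / fitted.sum(axis=1))[:, None]
        fitted *= (col_targets / fitted.sum(axis=0))[None, :]
    return fitted


def best_time_ms(func, repeat=5, number=20):
    """Best-of-`repeat` average time per call, in milliseconds."""
    timings = timeit.repeat(func, repeat=repeat, number=number)
    return min(timings) / number * 1e3


rng = np.random.default_rng(0)
seed = rng.uniform(1.0, 10.0, size=(20, 30))
row_targets = rng.uniform(1.0, 10.0, size=20)
col_targets = rng.uniform(1.0, 10.0, size=30)
# Both sets of targets must add up to the same grand total.
col_targets *= row_targets.sum() / col_targets.sum()

elapsed_ms = best_time_ms(lambda: ipf_2d(seed, row_targets, col_targets))
print(f"numpy IPF sketch: {elapsed_ms:.3f} ms per call")
```

Timing a second solver with the same `best_time_ms` helper and identical inputs would give a before/after comparison; absolute numbers depend on the machine and the table size.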

talgalili · 2025-11-11T09:52:33Z

Great job, thanks @neuralsorcerer !

I'm working to get this landed in the coming hour.

talgalili · 2025-11-11T18:23:54Z

FYI @neuralsorcerer
Just wanted to let you know I'm still working on this diff. It hasn't landed yet because there are a bunch of internal errors in the files that I'm sorting out. I suspect it will land within 24 hours after I finish fixing them.

neuralsorcerer · 2025-11-11T18:39:16Z

No hurry, and thank you for working on it, @talgalili.

meta-codesync · 2025-11-11T20:16:08Z

@talgalili merged this pull request in 21d657a.

Optimize rake by replacing ipfn with vectorized IPF

d7fe713

meta-cla bot added the cla signed label Nov 10, 2025

Fix lints

1f5eb35

Add tests

5e26c8a

Add benchmark

1147112

meta-codesync bot closed this in 21d657a Nov 11, 2025

facebook-github-bot added the Merged label Nov 11, 2025

neuralsorcerer deleted the ipfn branch November 12, 2025 07:42

neuralsorcerer wants to merge 4 commits into facebookresearch:main from neuralsorcerer:ipfn
