Description
Hi team DLC,
I have a particular use case for which I'm currently developing some code, and I would like your input on 1) whether you have insights into how best to get this done, and 2) whether you would be interested in having this as a feature (in which case I would create a PR).
Here's the use case: I have a set of videos in which I want to track one animal, but a second animal is present in part of the FOV. It is confined to a specific area, can't move much, and I'm not interested in its pose estimates. I tried analyzing one of these videos with a model trained on single-animal videos (hoping it would generalize as in Mathis et al., 2018, Fig. 4b), but there seems to be only one instance of each body part label per frame, so the left back paw label was sometimes placed on the animal I want to ignore, thus "stealing" the label from the animal of interest. I worked around this by applying a mask to the video before frame extraction, effectively hiding the FOV area that contains the second animal (and that the animal of interest can't physically enter). Hope this makes sense so far!
Now, here's the feature idea: instead of rewriting every video to produce a masked version, masking could work the same way as cropping, i.e., in the config.yaml file, each video entry could take an optional "mask: " setting that applies a mask to the video on the fly. The mask could be either a path to an image file with exactly the same dimensions as the cropped frames, or (messier but more "online") parameters for drawing shapes in the FOV (e.g., "circle", center_x, center_y, radius). The relevant functions would get corresponding flags (extract_frames obviously, but also analyze_videos and create_labeled_video).
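To make the idea concrete, here is a minimal sketch of what the on-the-fly masking step could look like, assuming each frame is available as a NumPy array of shape (height, width) or (height, width, channels). The names `circle_mask`, `apply_mask`, and the keys in `mask_params` are hypothetical and only illustrate the proposal; none of them exist in DeepLabCut today.

```python
import numpy as np


def circle_mask(height, width, center_x, center_y, radius):
    """Return a boolean (height, width) array that is True inside the circle."""
    yy, xx = np.ogrid[:height, :width]
    return (xx - center_x) ** 2 + (yy - center_y) ** 2 <= radius ** 2


def apply_mask(frame, hide, fill_value=0):
    """Return a copy of `frame` with pixels where `hide` is True set to `fill_value`."""
    if frame.shape[:2] != hide.shape:
        raise ValueError("mask and frame must have the same height and width")
    masked = frame.copy()
    masked[hide] = fill_value  # applies to every channel of the selected pixels
    return masked


# Hypothetical per-video parameters, mirroring the
# ("circle", center_x, center_y, radius) idea above.
mask_params = {"shape": "circle", "center_x": 50, "center_y": 40, "radius": 10}

# Stand-in for a decoded (and already cropped) video frame: all-white RGB image.
frame = np.full((80, 100, 3), 255, dtype=np.uint8)

hide = circle_mask(
    frame.shape[0],
    frame.shape[1],
    mask_params["center_x"],
    mask_params["center_y"],
    mask_params["radius"],
)
masked_frame = apply_mask(frame, hide)
```

Since the mask only depends on the frame size and the parameters, it could be built once per video and reused for every frame, which should keep the per-frame cost low. An image-file mask would plug into the same `apply_mask` step after being loaded and converted to a boolean array.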
And that's it: basically, it's about adding a built-in masking feature alongside the existing cropping feature, so that we don't have to do this step beforehand and lose time and disk space in the process. I don't know whether use cases that need masking occur often, though, hence this issue to get your insights first. Also, if you have suggestions for better ways to handle this use case than using masks, I'm looking forward to reading them!
Thank you,
Ludovic