Add initial support for generic inference #189
Conversation
This gets rid of the "image_type" notion introduced in 4e48d71 that required test_all functions with a bunch of yields. Instead, I'm taking advantage of the fact that nose will run tests only in classes that start with "Test". This lets me add a whole new set of tests simply by adding a new class that defines a different value for IMAGE_CHANNELS or CROP_SIZE, etc. This will be helpful when creating tests for NVIDIA#189.
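The class-based pattern described here can be sketched in a few lines. This is an illustration, not DIGITS code: the class names and the `check_channels` helper are hypothetical, but the mechanism is real, since nose only collects classes whose names start with `Test`, so a base class holding the shared logic is skipped and each new configuration is just a subclass overriding class attributes:

```python
# Sketch of the class-based test pattern: nose collects classes whose
# names start with "Test", so BaseTest itself is never run, and a new
# test configuration is added simply by subclassing and overriding
# attributes such as IMAGE_CHANNELS or CROP_SIZE.

class BaseTest(object):
    IMAGE_CHANNELS = 3
    CROP_SIZE = None

    def check_channels(self):
        # A real test would build a dataset with these settings;
        # here we just return the configuration under test.
        return (self.IMAGE_CHANNELS, self.CROP_SIZE)


class TestColorImages(BaseTest):
    pass  # inherits the defaults


class TestGrayscaleCropped(BaseTest):
    IMAGE_CHANNELS = 1
    CROP_SIZE = 16
```

nose would run every test method in both `Test*` classes, each seeing its own attribute values through normal inheritance.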
0a5e505 to
4cbc17a
Compare
Added tests in 4cbc17a. Can somebody test this and give me some feedback before merging? Since DIGITS can't create the LMDBs for you yet, you can create a test dataset with the script at
@lukeyeager this looks very promising!

- I have had to make minor tweaks into
- In order to create the dataset, is it possible to have the user specify
- In the model creation page, can we have the user choose which loss function they want to use?
- The 'infer many images' button didn't work for me (it just prints the file names in the .txt file).
- It would be nice to have path completion working for all the fields where we expect the user to provide a server path. Perhaps we could create a custom
Oh right, thanks. Fixed that.
Looks good, thanks.
I think this can be a discussion that comes after merging this PR (see "TODO after merging" in my OP). I've branched this out into a new issue for discussion - #197.
What do you mean? They have to choose their loss function manually in their custom prototxt at the moment, since there aren't any standard networks.
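For anyone following along, a sketch of what specifying the loss manually in a custom prototxt can look like. This is an illustration, not DIGITS code: the bottom blob names are assumptions, and `EuclideanLoss` is just one choice (`SoftmaxWithLoss` would be another, for classification-style labels):

```
layer {
  name: "loss"
  type: "EuclideanLoss"
  bottom: "fc_output"  # network prediction; blob name is an assumption
  bottom: "label"      # ground truth supplied by the data layer
  top: "loss"
}
```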
Great idea! I was hoping you would come along and add your autocomplete stuff to these fields as well. Doing it with a custom field sounds like a good idea to me.
Not sure if I'm doing this properly at all, but I tried to put in an AlexNet as the deploy.prototxt and train it against the test data generated via test_lmdb_creator.py. I'm getting the following error: Just wondering if I'm doing something obviously wrong?
I just realized that AlexNet expects 3-channel input, so the test images I generate are probably not going to work?
@y22ma, thanks for the help reviewing this!
No, you should be fine. The network is pretty flexible.
I ran into that error once, but I fixed it here. Apparently the issue has resurfaced somewhere else. I'm looking into this now... You can try running DIGITS in debug mode to see if you get any information:
I overlooked the part where the loss function is specified. This looks OK, sorry. The "infer many" menu is working on your latest commit, thanks. A possible enhancement (I suppose in the context of #197) would be to show the ground truth when specified in the text file.
@lukeyeager no problem, really appreciate this functionality coming together. I think I made a mistake with my previous test by creating only 50 images instead of the intended 5000 images. I suspect that with 50 images, the default batch size is too large. Now I'm running into another issue: Likely because test_lmdb_creator.py creates fewer labels than the output of the softmax layer...
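A quick sanity check along these lines can catch the mismatch before training: every label stored in the LMDB must be a valid index into the softmax output, i.e. `0 <= label < num_output`. A minimal sketch (the `labels` list and `num_output` value here are hypothetical stand-ins, not what test_lmdb_creator.py actually produces):

```python
# Sanity check: labels must index into the softmax output.
# `labels` stands in for the labels stored in the LMDB and
# `num_output` for the fc8/softmax dimension (both hypothetical).

def check_labels(labels, num_output):
    """Return the labels that fall outside [0, num_output)."""
    return [l for l in labels if not 0 <= l < num_output]

labels = [0, 1, 2, 7]                     # example label values
bad = check_labels(labels, num_output=3)  # labels the net can't represent
print(bad)
```

If `bad` is non-empty, either the dataset has more classes than the network's final layer, or the labels were written with the wrong range.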
I got AlexNet to work on that exact same set of images. But I had to adjust it to fit the specific problem at hand. Here's what I had to change:
Don't require people to explicitly type train_ in the layer name for these layers
More reliable than reading information from the Job or Task
@lukeyeager unfortunately I can't reproduce your results after following your instructions. Here's my train_val.prototxt: And here's the error: It seems that the output configuration on fc8 is not taking effect. I'm on commit f3cee35. Note that I'm using the NVIDIA fork of Caffe.
Oh, well you need to create two

And you should remove the
I'm going to go ahead and merge this PR. For any bugs or requests related to this new set of features, please create new issues or ask for help on the mailing list. Thanks for the review help, @gheinrich and @y22ma!
All good, and thanks for all the tips to get it working. It would be awesome to see example use cases of this feature documented on the wiki page; it would definitely help a lot of people out.
Hi @lukeyeager, I looked through issue #177 and have concerns about the pixel-classification or segmentation task. We can construct the LMDB with a given folder structure like this: But do we really need to write them out as files? Is there any way to construct the LMDB so that it holds a portion of memory as a training instance? I mean, the training stack and training labels are already in memory. While training the model, a patch and its corresponding labels would be extracted on the fly.
Until we decide on a solution to #197, you'll have to create your LMDB for this task manually anyway. The
Not with LMDB, no. You might be able to do something like that with a

If you get that working and you'd like to use it in DIGITS, please open a separate issue with your request.
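To illustrate the "extract a patch and its label on the fly" idea, independent of how the data is ultimately fed to Caffe, here is a pure-Python sketch. The function name and the toy 5x5 image are hypothetical; a real pipeline would use numpy arrays over the in-memory training stack:

```python
# Sketch of on-the-fly patch extraction from an in-memory image,
# represented here as a plain 2-D list of pixel values. The label at
# the patch's center pixel is returned alongside the patch.

def extract_patch(image, labels, row, col, size):
    """Return (size x size patch centered at (row, col), label there)."""
    half = size // 2
    patch = [r[col - half:col + half + 1]
             for r in image[row - half:row + half + 1]]
    return patch, labels[row][col]

# Toy 5x5 "image" and per-pixel label map.
image = [[p for p in range(5)] for _ in range(5)]
labels = [[r % 2 for _ in range(5)] for r in range(5)]

patch, label = extract_patch(image, labels, row=2, col=2, size=3)
```

Each (patch, label) pair is generated at training time instead of being serialized to disk first, which is the gist of the question above.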
@lukeyeager Do you happen to have any reference for creating an LMDB for object detection (e.g. R-CNN)? I'd appreciate any tips you have.
I do not, sorry. Your best bet would probably be the Caffe mailing list.

Adds a new type of task to DIGITS - Generic Inference.

Solves #97, #117, #177
DIGITS previously only supported "Image Classification," and made assumptions about the types of networks being used, the format of input data, and the way the output of the model should be interpreted.
The new task is more generalized, so you can do other things like object detection or per-pixel segmentation. The network can have one or more n-dimensional blobs, which DIGITS does not try to interpret in any way. I wrestled with a bunch of different names for this - Regression, General-Purpose Networks, Multi-blob Output Models, Dense Prediction, Other Networks, etc. None was quite accurate and simple enough, so I'm going with Generic Inference.

Remaining limitations
This is much more generic, but DIGITS still puts some restrictions on what you can do with your network:
Data

TODO before merging
TODO after merging