Refine the documentation of im2rec #12606
@mxnet-label-bot[pr-awaiting-review] |
lupesko
left a comment
Looks good, just one clarification.
| 972457 633 n07723039_1627.JPEG | ||
| 7534 11 n01630670_4486.JPEG | ||
| 1191261 249 n12407079_5106.JPEG | ||
| 95099 464.000000 n04467665_17283.JPEG |
Is the decimal point really required? This looks a bit odd.
If the data.lst file is generated by im2rec.py rather than written by hand, the labels will have that decimal point, so I think showing it is less confusing for users.
The reason it uses floating point is that the label can be a regression target, e.g. 68.6 kg for a human body weight.
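To make the list-file layout discussed above concrete (an integer index, one or more label values, then the image path), here is a minimal Python sketch. It is not the code from im2rec.py: the helper names `format_lst_line` and `parse_lst_line` are hypothetical, and it assumes tab-separated fields with labels written via `%f`, which would yield values such as `464.000000`.

```python
def format_lst_line(index, labels, path):
    """Build one list-file line: index, float labels, then the image path.

    Hypothetical helper; assumes tab-separated fields and '%f' label formatting.
    """
    fields = ["%d" % index] + ["%f" % label for label in labels] + [path]
    return "\t".join(fields)


def parse_lst_line(line):
    """Split a list-file line back into (index, labels, path)."""
    fields = line.split()  # assumes image paths contain no whitespace
    index = int(fields[0])
    labels = [float(value) for value in fields[1:-1]]
    path = fields[-1]
    return index, labels, path


line = format_lst_line(95099, [464], "n04467665_17283.JPEG")
print(line)
print(parse_lst_line(line))
```

Parsing labels as floats means that an integer class id such as `11` and a regression target such as `68.6` go through the same code path.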
sandeep-krishnamurthy
left a comment
LGTM.
However, we need to revisit why we need two versions and, if both are required, why there is a discrepancy in functionality.
aaronmarkham
left a comment
I realize this was already merged, but I have suggestions nonetheless.
| * RecordIO has a simple way to partition, simplifying distributed setting. We provide an example later. | ||
| We provide the [im2rec tool](https://github.com/dmlc/mxnet/blob/master/tools/im2rec.cc) so you can create an Image RecordIO dataset by yourself. The following walkthrough shows you how. | ||
| We provide the [im2rec tool](https://github.com/dmlc/mxnet/blob/master/tools/im2rec.cc) so you can create an Image RecordIO dataset by yourself. The following walkthrough shows you how. Note that there is python version of [im2rec tool](https://github.com/apache/incubator-mxnet/blob/master/tools/im2rec.py) and [example](https://mxnet.incubator.apache.org/tutorials/basic/data.html) using real-world data. |
We provide two tools for creating a RecordIO dataset.
- im2rec.cc - implements the tool using the C++ API.
- im2rec.py - implements the tool using the Python API.
Both provide the same output: a RecordIO dataset.
(Then take this mention and add it later for "Next Steps". I don't think you want them leaving this FAQ/tutorial quite yet.)
You may want to also review the example using real-world data with im2rec.py.
| ### Step 1. Make an Image List File | ||
| * Note that the im2rec.py provide a param `--list` to generate the list for you but im2rec.cc doesn't support it. |
provides
you, but
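For readers wondering what generating an image list involves, the following Python sketch builds list-file lines from a folder of images. It does not reproduce what im2rec.py does for `--list`: the function name `make_image_list`, the one-subfolder-per-class layout, and the extension filter are all assumptions made for illustration.

```python
import os


def make_image_list(root, extensions=(".jpg", ".jpeg", ".png")):
    """Return list-file lines for images under root.

    Hypothetical helper: assumes each immediate subfolder of root is one
    class, and assigns labels 0, 1, 2, ... in sorted folder order.
    """
    lines = []
    index = 0
    class_dirs = sorted(
        name for name in os.listdir(root)
        if os.path.isdir(os.path.join(root, name))
    )
    for label, class_dir in enumerate(class_dirs):
        for fname in sorted(os.listdir(os.path.join(root, class_dir))):
            # Skip files that do not look like images.
            if fname.lower().endswith(extensions):
                rel_path = os.path.join(class_dir, fname)
                # index, float label, relative path (tab-separated)
                lines.append("%d\t%f\t%s" % (index, label, rel_path))
                index += 1
    return lines
```

Writing the returned lines to a file, one per line, gives a list in the index, label, path layout shown earlier in the conversation.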
| ``` | ||
| #### Using tools/im2rec.py | ||
| You can also convert raw images into *RecordIO* format using the ``im2rec.py`` utility script that is provided in the MXNet [src/tools](https://github.com/dmlc/mxnet/tree/master/tools) folder. |
Update the repo link
| You can also convert raw images into *RecordIO* format using the ``im2rec.py`` utility script that is provided in the MXNet [src/tools](https://github.com/dmlc/mxnet/tree/master/tools) folder. | ||
| An example of how to use the script for converting to *RecordIO* format is shown in the `Image IO` section below. | ||
| * Note that there is a C++ version of [im2rec](https://github.com/dmlc/mxnet/blob/master/tools/im2rec.cc), please refer to [here](https://mxnet.incubator.apache.org/faq/recordio.html) for more information. |
Please don't link only "here". Provide the full description of what the link is going to.
Note that there is a C++ API implementation of im2rec. Refer to the RecordIO FAQ for more information.
Description
Currently, we have two im2rec tools: one written in Python and the other in C++. They differ slightly in functionality. This PR also helps resolve #11884.
Comments
@aaronmarkham
Please let me know how I can make it less confusing.