
Commit ed6105b

ppwwyyxx authored and facebook-github-bot committed
update docs
Summary: Pull Request resolved: fairinternal/detectron2#310

Differential Revision: D17975608

Pulled By: ppwwyyxx

fbshipit-source-id: 5397020062fb7b6c2041b9e0e312d02c1721bb47
1 parent 951417e commit ed6105b

File tree

4 files changed: +47 -39 lines changed


MODEL_ZOO.md

Lines changed: 1 addition & 1 deletion

@@ -549,7 +549,7 @@ These baselines are described in Table 3(c) of the [LVIS paper](https://arxiv.or
 
 NOTE: the 1x schedule here has the same amount of __iterations__ as the COCO 1x baselines.
 They are roughly 24 epochs of LVISv0.5 data.
-The final results of these configs has large variance across different runs.
+The final results of these configs have large variance across different runs.
 
 <!--
 ./gen_html_table.py --config 'LVIS-InstanceSegmentation/mask*50*' 'LVIS-InstanceSegmentation/mask*101*' --name R50-FPN R101-FPN X101-FPN --fields lr_sched train_speed inference_speed mem box_AP mask_AP

datasets/README.md

Lines changed: 13 additions & 8 deletions

@@ -2,7 +2,7 @@
 For a few datasets that detectron2 natively supports,
 the datasets are assumed to exist in a directory called
 "datasets/", under the directory where you launch the program.
-with the following directory structure:
+They need to have the following directory structure:
 
 ## Expected dataset structure for COCO instance/keypoint detection:
 
@@ -17,7 +17,7 @@ coco/
 
 You can use the 2014 version of the dataset as well.
 
-Some of the builtin tests (`run_*_tests.sh`) uses a tiny version of the COCO dataset,
+Some of the builtin tests (`dev/run_*_tests.sh`) uses a tiny version of the COCO dataset,
 which you can download with `./prepare_for_tests.sh`.
 
 ## Expected dataset structure for PanopticFPN:
@@ -28,6 +28,7 @@ coco/
   panoptic_{train,val}2017.json
   panoptic_{train,val}2017/
     # png annotations
+  panoptic_stuff_{train,val}2017/  # generated by the script mentioned below
 ```
 
 Install panopticapi by:
@@ -36,13 +37,13 @@ pip install git+https://github.com/cocodataset/panopticapi.git
 ```
 Then, run `./prepare_panoptic_fpn.py`, to extract semantic annotations from panoptic annotations.
 
-## Expected dataset structure for LVIS instance detection/segmentation:
+## Expected dataset structure for LVIS instance segmentation:
 ```
 coco/
   {train,val,test}2017/
 lvis/
   lvis_v0.5_{train,val}.json
-  lvis_v0.5_image_info_test.json
+  lvis_v0.5_image_info_test.json
 ```
 
 Install lvis-api by:
@@ -56,8 +57,8 @@ cityscapes/
   gtFine/
     train/
      aachen/
-       color.png, instanceIds.png, labelIds.png, polygons.json
-       labelTrainIds.png (created by cityscapesscripts/preparation/createTrainIdLabelImgs.py)
+       color.png, instanceIds.png, labelIds.png, polygons.json,
+       labelTrainIds.png
      ...
     val/
     test/
@@ -71,10 +72,14 @@ Install cityscapes scripts by:
 pip install git+https://github.com/mcordts/cityscapesScripts.git
 ```
 
+Note:
+labelTrainIds.png are created by `cityscapesscripts/preparation/createTrainIdLabelImgs.py`.
+They are not needed for instance segmentation.
+
 ## Expected dataset structure for Pascal VOC:
 ```
 VOC20{07,12}/
   Annotations/
-  ImageSets/
-  JPEGImages/
+  ImageSets/
+  JPEGImages/
 ```
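The layout above can be sanity-checked before launching a run. This is a small sketch, not part of the commit: the `datasets/coco` root follows the README's convention of resolving "datasets/" relative to the launch directory, and the subdirectory names are assumptions based on the standard COCO layout.

```python
import os

# Sketch: verify the assumed COCO directory layout exists under the
# default "datasets/" root (relative to where the program is launched).
# Subdirectory names below are assumptions, not taken from this commit.
root = os.path.join("datasets", "coco")
expected = ["annotations", "train2017", "val2017"]
missing = [d for d in expected if not os.path.isdir(os.path.join(root, d))]
if missing:
    print("missing under %s: %s" % (root, ", ".join(missing)))
else:
    print("COCO layout looks complete")
```

A check like this fails fast with a readable message instead of a dataset-loading traceback later in training.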

detectron2/data/catalog.py

Lines changed: 1 addition & 0 deletions

@@ -34,6 +34,7 @@ def register(name, func):
             name (str): the name that identifies a dataset, e.g. "coco_2014_train".
             func (callable): a callable which takes no arguments and returns a list of dicts.
         """
+        assert callable(func), "You must register a function with `DatasetCatalog.register`!"
         DatasetCatalog._REGISTERED[name] = func
 
     @staticmethod

docs/tutorials/datasets.md

Lines changed: 32 additions & 30 deletions

@@ -46,42 +46,42 @@ can load an image from "file_name" if the "image" field is not available.
 + `sem_seg_file_name`: the full path to the ground truth semantic segmentation file.
 + `image`: the image as a numpy array.
 + `sem_seg`: semantic segmentation ground truth in a 2D numpy array. Values in the array represent
-category labels.
+category labels.
 + `height`, `width`: integer. The shape of image.
 + `image_id` (str): a string to identify this image. Mainly used during evaluation to identify the
-image. Each dataset may use it for different purposes.
+image. Each dataset may use it for different purposes.
 + `annotations` (list[dict]): the per-instance annotations of every
-instance in this image. Each annotation dict may contain:
-+ `bbox` (list[float]): list of 4 numbers representing the bounding box of the instance.
-+ `bbox_mode` (int): the format of bbox.
-It must be a member of
-[structures.BoxMode](../modules/structures.html#detectron2.structures.BoxMode).
-Currently supports: `BoxMode.XYXY_ABS`, `BoxMode.XYWH_ABS`.
-+ `category_id` (int): an integer in the range [0, num_categories) representing the category label.
-The value num_categories is reserved to represent the "background" category, if applicable.
-+ `segmentation` (list[list[float]] or dict):
-+ If `list[list[float]]`, it represents a list of polygons, one for each connected component
-of the object. Each `list[float]` is one simple polygon in the format of `[x1, y1, ..., xn, yn]`.
-The Xs and Ys are either relative coordinates in [0, 1], or absolute coordinates,
-depend on whether "bbox_mode" is relative.
-+ If `dict`, it represents the per-pixel segmentation mask in COCO's RLE format.
-+ `keypoint`s (list[float]): in the format of [x1, y1, v1,..., xn, yn, vn].
-v[i] means the visibility of this keypoint.
-`n` must be equal to the number of keypoint categories.
-The Xs and Ys are either relative coordinates in [0, 1], or absolute coordinates,
-depend on whether "bbox_mode" is relative.
-
-Note that the coordinate annotations in COCO format are integers in range [0, H-1 or W-1].
-By default, detectron2 adds 0.5 to absolute keypoint coordinates to convert them from discrete
+instance in this image. Each annotation dict may contain:
++ `bbox` (list[float]): list of 4 numbers representing the bounding box of the instance.
++ `bbox_mode` (int): the format of bbox.
+It must be a member of
+[structures.BoxMode](../modules/structures.html#detectron2.structures.BoxMode).
+Currently supports: `BoxMode.XYXY_ABS`, `BoxMode.XYWH_ABS`.
++ `category_id` (int): an integer in the range [0, num_categories) representing the category label.
+The value num_categories is reserved to represent the "background" category, if applicable.
++ `segmentation` (list[list[float]] or dict):
++ If `list[list[float]]`, it represents a list of polygons, one for each connected component
+of the object. Each `list[float]` is one simple polygon in the format of `[x1, y1, ..., xn, yn]`.
+The Xs and Ys are either relative coordinates in [0, 1], or absolute coordinates,
+depend on whether "bbox_mode" is relative.
++ If `dict`, it represents the per-pixel segmentation mask in COCO's RLE format.
++ `keypoint`s (list[float]): in the format of [x1, y1, v1,..., xn, yn, vn].
+v[i] means the visibility of this keypoint.
+`n` must be equal to the number of keypoint categories.
+The Xs and Ys are either relative coordinates in [0, 1], or absolute coordinates,
+depend on whether "bbox_mode" is relative.
+
+Note that the coordinate annotations in COCO format are integers in range [0, H-1 or W-1].
+By default, detectron2 adds 0.5 to absolute keypoint coordinates to convert them from discrete
 pixel indices to floating point coordinates.
-+ `iscrowd`: 0 or 1. Whether this instance is labeled as COCO's "crowd region".
++ `iscrowd`: 0 or 1. Whether this instance is labeled as COCO's "crowd region".
 + `proposal_boxes` (array): 2D numpy array with shape (K, 4) representing K precomputed proposal boxes for this image.
 + `proposal_objectness_logits` (array): numpy array with shape (K, ), which corresponds to the objectness
-logits of proposals in 'proposal_boxes'.
+logits of proposals in 'proposal_boxes'.
 + `proposal_bbox_mode` (int): the format of the precomputed proposal bbox.
-It must be a member of
-[structures.BoxMode](../modules/structures.html#detectron2.structures.BoxMode).
-Default format is `BoxMode.XYXY_ABS`.
+It must be a member of
+[structures.BoxMode](../modules/structures.html#detectron2.structures.BoxMode).
+Default format is `BoxMode.XYXY_ABS`.
 
 
 If your dataset is already in the COCO format, you can simply register it by
@@ -146,12 +146,14 @@ Some additional metadata that are specific to the evaluation of certain datasets
 * `stuff_dataset_id_to_contiguous_id` (dict[int->int]): Used when generating prediction json files for
 semantic/panoptic segmentation.
 A mapping from semantic segmentation class ids in the dataset
-to contiguous ids in [0, num_categories). It is useful for evaluation only.
+to contiguous ids in [0, num_categories). It is useful for evaluation only.
 
 * `json_file`: The COCO annotation json file. Used by COCO evaluation for COCO-format datasets.
 * `panoptic_root`, `panoptic_json`: Used by panoptic evaluation.
 * `evaluator_type`: Used by the builtin main training script to select
 evaluator. No need to use it if you write your own main script.
+You can just provide the [DatasetEvaluator](../modules/evaluation.html#detectron2.evaluation.DatasetEvaluator)
+for your dataset directly in your main script.
 
 NOTE: For background on the difference between "thing" and "stuff" categories, see
 [On Seeing Stuff: The Perception of Materials by Humans and Machines](http://persci.mit.edu/pub_pdfs/adelson_spie_01.pdf).
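The dataset-dict format that this tutorial diff documents can be sketched as one plain Python record. This is an illustration, not content from the commit: the file name and values are made up, and `bbox_mode` is written as a raw int (assuming `BoxMode.XYWH_ABS` enumerates to 1) so the sketch runs without detectron2 installed.

```python
# A hypothetical single record in the per-image dict format described above.
record = {
    "file_name": "datasets/coco/train2017/0001.jpg",  # made-up path
    "image_id": "0001",
    "height": 480,
    "width": 640,
    "annotations": [
        {
            # 4 numbers; their interpretation depends on bbox_mode
            "bbox": [10.0, 20.0, 100.0, 150.0],
            # BoxMode.XYWH_ABS, written as an int (assumed value 1) so this
            # sketch does not require detectron2 to be importable
            "bbox_mode": 1,
            "category_id": 0,  # in [0, num_categories)
            # one simple polygon per connected component: [x1, y1, ..., xn, yn]
            "segmentation": [
                [10.0, 20.0, 110.0, 20.0, 110.0, 170.0, 10.0, 170.0]
            ],
            "iscrowd": 0,  # 0 = not a COCO "crowd region"
        }
    ],
}

# Polygons must list an even number of coordinates (x, y pairs).
for ann in record["annotations"]:
    for poly in ann["segmentation"]:
        assert len(poly) % 2 == 0 and len(poly) >= 6
```

A dataset function registered with `DatasetCatalog.register` would return a list of records shaped like this one.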
