DIP4FISH: 06.21

Tuesday, June 15, 2021

Trying to get unstuck: back to COCO

The last post is from October 2020. The main line of work was to make progress on chromosome instance segmentation, but a robust U-Net-based semantic segmentation would have been satisfying too (possibly using fastai).

Detectron2, PixelLib and many others provide instance segmentation algorithms (Mask R-CNN for example). To train a model, the COCO format seems to be mandatory for the so-called ground-truth labels. The issue is that, in the different datasets generated to simulate overlapping chromosomes, the labels are grey-scale images that can be decomposed into binary masks for one-hot encoding:
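The decomposition of a grey-scale label into binary masks can be sketched as follows. This is only a minimal illustration with NumPy: the function name `grey_to_binary_masks`, the toy 4x5 label and the grey-level coding (0 = background, 1 and 2 = the two chromosomes, 3 = their overlap) are assumptions made for the example, not the coding used in the actual datasets.

```python
import numpy as np


def grey_to_binary_masks(label: np.ndarray) -> dict[int, np.ndarray]:
    """Split a grey-scale label image into one binary mask per grey level.

    Grey level 0 is assumed to be the background and is skipped.
    """
    levels = [int(v) for v in np.unique(label) if v != 0]
    return {v: (label == v).astype(np.uint8) for v in levels}


# Toy label image (hypothetical coding):
# 1 = first chromosome, 2 = second chromosome, 3 = overlapping pixels
label = np.array(
    [
        [0, 1, 1, 0, 0],
        [0, 1, 3, 2, 0],
        [0, 0, 2, 2, 0],
        [0, 0, 0, 0, 0],
    ],
    dtype=np.uint8,
)

masks = grey_to_binary_masks(label)

# Stack the masks along a last axis to get an H x W x C one-hot array
stacked = np.stack([masks[v] for v in sorted(masks)], axis=-1)
```

Under that assumed coding, the full mask of the first chromosome as an instance would be `masks[1] | masks[3]`, since the overlapping pixels belong to both chromosomes.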

In the last attempt, the idea was to start from the COCO specs and to write some code converting the binary masks into COCO files, but that failed: detectron2 did not accept my minimalist dataset.

Making a minimal valid COCO dataset

From a grey-scale image, a COCO file is generated using an interactive online tool such as https://www.makesense.ai/:

The COCO dataset corresponding to this single image is a JSON file:

With an XML viewer in Colab, we can see how the file is structured:

The file corresponds to only one image: "grey0000001.png"

The two annotated chromosomes appear as id:0 and id:0 in the annotations field:

The contour of one of the two chromosomes is encoded as 24 values, presumably 12 pairs of (x, y) coordinates:

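The chromosome bounding box is stored as four values. At first sight it looks like two diagonal points, but in the COCO format a bbox is [x, y, width, height]: the top-left corner followed by the size of the box: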
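To check how the contour and the bounding box relate, here is a small sketch in plain Python. In the COCO format a polygon is a flat list [x1, y1, x2, y2, ...] and a bbox is [x, y, width, height]. The helper names (`polygon_pairs`, `polygon_bbox`, `bbox_corners`) and the 24 contour values are made up for the illustration; they are not the values from the makesense.ai export.

```python
def polygon_pairs(flat):
    """Group a flat COCO polygon [x1, y1, x2, y2, ...] into (x, y) tuples."""
    if len(flat) % 2:
        raise ValueError("a COCO polygon needs an even number of values")
    return list(zip(flat[0::2], flat[1::2]))


def polygon_bbox(flat):
    """Return the COCO-style bbox [x_min, y_min, width, height] of a polygon."""
    xs, ys = flat[0::2], flat[1::2]
    return [min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys)]


def bbox_corners(bbox):
    """Convert [x, y, width, height] into its two diagonal corner points."""
    x, y, w, h = bbox
    return (x, y), (x + w, y + h)


# Hypothetical contour made of 24 values, i.e. 12 (x, y) points
contour = [10, 5, 14, 4, 18, 6, 21, 10, 22, 15, 20, 20,
           16, 23, 12, 22, 9, 19, 7, 15, 7, 10, 8, 7]

points = polygon_pairs(contour)
bbox = polygon_bbox(contour)
corners = bbox_corners(bbox)
```

If the bbox of an exported annotation matches `polygon_bbox` applied to its own contour, the last two numbers are a width and a height, not the coordinates of a second point.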

Finally, there is only one category of instances: "chromosome"
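Putting the pieces together (one image, two annotations, one category), a minimal COCO dictionary can be sketched in plain Python. Only the file name "grey0000001.png" and the category "chromosome" come from the file above; the image size, the two polygons, the id values and the function names are assumptions for the example. The annotation ids are made unique here, since pycocotools indexes annotations by their id.

```python
import json


def polygon_area(flat):
    """Area of a flat polygon [x1, y1, x2, y2, ...] (shoelace formula)."""
    xs, ys = flat[0::2], flat[1::2]
    n = len(xs)
    twice = sum(xs[i] * ys[(i + 1) % n] - xs[(i + 1) % n] * ys[i] for i in range(n))
    return abs(twice) / 2.0


def make_minimal_coco(file_name, width, height, polygons, category="chromosome"):
    """Build a COCO-style dict for one image and one category."""
    annotations = []
    for ann_id, flat in enumerate(polygons):
        xs, ys = flat[0::2], flat[1::2]
        annotations.append({
            "id": ann_id,            # unique per annotation
            "image_id": 1,           # must match the "id" of the image entry
            "category_id": 1,        # must match the "id" of the category entry
            "segmentation": [flat],  # list of polygons, each a flat list
            "bbox": [min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys)],
            "area": polygon_area(flat),
            "iscrowd": 0,
        })
    return {
        "images": [{"id": 1, "file_name": file_name, "width": width, "height": height}],
        "annotations": annotations,
        "categories": [{"id": 1, "name": category}],
    }


# Two hypothetical rectangular contours standing in for the two chromosomes
polygons = [
    [10, 10, 30, 10, 30, 40, 10, 40],
    [25, 20, 60, 20, 60, 35, 25, 35],
]
coco = make_minimal_coco("grey0000001.png", 100, 100, polygons)
coco_json = json.dumps(coco, indent=2)  # text that could be written to a .json file
```

The string `coco_json` can then be saved to disk and compared, field by field, with the file exported by makesense.ai.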

Back to COCO, playing with a minimalist valid dataset using pycocotools:
