Friday, January 31, 2020

2164 full resolution pairs of synthetic images of two overlapping chromosomes

After the groundtruth images of the "13434" dataset were fixed, an older but full-resolution dataset has to be repaired too.

Repair of the "overlapping_chromosomes_examples.h5" dataset:

This dataset originally contained 2854 (grayscale + groundtruth) pairs of 190x189 images, stored in a single numpy array of shape 2854x190x189x2.

The grayscale images suffered from two problems:
  • In some grayscale images, the components had black dots: those images were removed (with their corresponding groundtruth labels).

  • The image dtype was int64; it is now np.uint8.
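
The two repairs can be sketched as follows. This is only an illustration on a small synthetic array: the `keep` mask is hypothetical, since the way black dots were detected is not described here, and `repair_pairs` is a name chosen for the example.

```python
import numpy as np

def repair_pairs(pairs, keep):
    """Drop rejected pairs and cast the remaining ones from int64 to uint8.

    pairs: array of shape (N, H, W, 2), last axis = (grayscale, groundtruth).
    keep:  boolean array of length N, True for the pairs to keep
           (hypothetical: the black-dot criterion is not specified here).
    """
    kept = pairs[np.asarray(keep, dtype=bool)]
    # Casting to uint8 silently wraps values outside 0..255, so check first.
    if kept.size and (kept.min() < 0 or kept.max() > 255):
        raise ValueError("values do not fit in uint8")
    return kept.astype(np.uint8)

# Small synthetic stand-in for the real (2854, 190, 189, 2) int64 array.
rng = np.random.default_rng(0)
demo = rng.integers(0, 256, size=(5, 4, 3, 2), dtype=np.int64)
keep = np.array([True, False, True, True, False])
repaired = repair_pairs(demo, keep)
print(repaired.shape, repaired.dtype)  # (3, 4, 3, 2) uint8
```

Indexing the first axis with the mask removes a grayscale image together with its groundtruth, and uint8 storage needs one eighth of the memory of int64.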

The overlapping domain is now also more realistic when compared with real overlapping chromosomes:

The groundtruth labels no longer have spurious pixels:

Dataset format:

Once downloaded, the dataset has shape (2164, 190, 189, 2) and is available as:

Download the dataset with a jupyter notebook:
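
Once the file is saved locally, loading it and splitting the pairs could look like the sketch below. It rests on assumptions: the file holds a single HDF5 dataset whose key is discovered rather than hard-coded, and channel 0 is the grayscale image while channel 1 is the groundtruth. `load_pairs` and `split_pairs` are names chosen for this example; the demo runs on a synthetic array, not on the real file.

```python
import numpy as np

def load_pairs(path):
    """Read the first dataset found in an HDF5 file into a numpy array.

    Assumption: the file holds a single dataset; its key name is not
    given here, so it is discovered rather than hard-coded.
    """
    import h5py  # third-party; imported lazily so the rest runs without it
    with h5py.File(path, "r") as f:
        name = list(f.keys())[0]
        return f[name][...]

def split_pairs(pairs):
    """Split (N, H, W, 2) into grayscale images and groundtruth labels.

    Assumption: channel 0 is the grayscale image, channel 1 the labels.
    """
    if pairs.ndim != 4 or pairs.shape[-1] != 2:
        raise ValueError(f"expected (N, H, W, 2), got {pairs.shape}")
    return pairs[..., 0], pairs[..., 1]

# Synthetic stand-in with the same layout as the real array.
demo = np.zeros((3, 190, 189, 2), dtype=np.uint8)
demo[..., 1] = 2
grey, labels = split_pairs(demo)
print(grey.shape, labels.shape)  # (3, 190, 189) (3, 190, 189)
```

With the real file, `split_pairs(load_pairs(path))` would return two arrays of shape (2164, 190, 189).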

Thursday, January 16, 2020

A conversion to the COCO dataset format is in sight

Detectron2 provides numerous models for semantic/instance segmentation. Contrary to TensorFlow 2, the PyTorch 1.3/Detectron2 pair doesn't seem to require an AVX-capable CPU. However, Detectron2 works on datasets built according to the COCO format.

Start thinking about how to convert the different overlapping chromosomes datasets into a COCO dataset:
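
As a first idea, one groundtruth image could be turned into COCO-style annotations as in the sketch below. It uses numpy only and assumes that each distinct nonzero label value becomes one category with one annotation; how the overlap label should map to instances is left open. The function names and the `file_name` value are hypothetical. The segmentation is written as an uncompressed RLE (run lengths in column-major order, starting with a run of zeros); polygons are the other form COCO accepts.

```python
import numpy as np

def mask_to_rle(mask):
    """Uncompressed COCO RLE: run lengths in column-major order, zeros first."""
    flat = np.asarray(mask, dtype=np.uint8).flatten(order="F")
    # Positions where the value changes mark the run boundaries.
    change = np.flatnonzero(flat[1:] != flat[:-1]) + 1
    bounds = np.concatenate(([0], change, [flat.size]))
    counts = np.diff(bounds).tolist()
    if flat.size and flat[0] == 1:
        counts = [0] + counts  # the first run must count zeros
    return {"size": [int(mask.shape[0]), int(mask.shape[1])], "counts": counts}

def mask_to_annotation(mask, image_id, ann_id, category_id):
    """Build one COCO-style annotation dict from a non-empty binary mask."""
    ys, xs = np.nonzero(mask)
    x0, y0 = int(xs.min()), int(ys.min())
    w, h = int(xs.max()) - x0 + 1, int(ys.max()) - y0 + 1
    return {
        "id": ann_id, "image_id": image_id, "category_id": category_id,
        "bbox": [x0, y0, w, h],  # COCO boxes are [x, y, width, height]
        "area": int(mask.sum()),
        "segmentation": mask_to_rle(mask),
        "iscrowd": 0,
    }

def labels_to_annotations(labels, image_id, first_ann_id=1):
    """Assumption: one annotation per distinct nonzero label value."""
    values = [int(v) for v in np.unique(labels) if v != 0]
    return [mask_to_annotation(labels == v, image_id, first_ann_id + i, v)
            for i, v in enumerate(values)]

# Tiny synthetic groundtruth: a 2x3 block of label 1 and one pixel of label 2.
labels = np.zeros((4, 5), dtype=np.uint8)
labels[1:3, 1:4] = 1
labels[3, 4] = 2
anns = labels_to_annotations(labels, image_id=0)
coco = {
    "images": [{"id": 0, "height": 4, "width": 5, "file_name": "demo.png"}],
    "annotations": anns,
    "categories": [{"id": a["category_id"], "name": f"label_{a['category_id']}"}
                   for a in anns],
}
print(anns[0]["bbox"], anns[0]["area"])  # [1, 1, 3, 2] 6
```

For the real data, the grayscale images would also have to be written out as image files referenced by `file_name`, and the `coco` dictionary dumped to JSON.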