This blog is dedicated to digital image processing for fluorescence in-situ hybridization, QFISH and other telomere-related topics.
Monday, October 26, 2020
Looking inside the "2164 dataset": How balanced is it?
Friday, August 7, 2020
COCO dataset from scratch : try and fail ...
Detectron2 provides several algorithms for instance segmentation, so it was tempting to submit the overlapping-chromosome datasets to one of them. However, to use one of these algorithms, the dataset seems to have to follow the MS-COCO format.
One available dataset consists of 2164 pairs of grayscale and ground-truth images. To give it a try, a minimalist dataset with one image and two instances could be converted to COCO format:
The two instances (right) are obtained from the ground-truth image showing the overlapping chromosomes. The instances are NumPy arrays, which can be saved as PNG images. To generate a COCO dataset associated with the grayscale image (left), the following steps were followed:
- Generate a Python dictionary according to the COCO format specification found in the Detectron2 documentation, converting the binary masks to their bounding boxes and compressed RLE using pycocotools.
- Save the dictionary as a JSON file.
- Load the JSON file with pycocotools (or Detectron2) in order to visualize, if possible, the instances overlaid on the grayscale image.
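The first two steps can be sketched roughly as follows. This is not the notebook's code: to stay dependency-light it uses plain NumPy instead of pycocotools and writes an uncompressed RLE (column-major run lengths) rather than the compressed one. The file name `overlap_0001.png`, the category name `chromosome`, the helper names and the tiny 4x5 masks are all made up for illustration, and each mask is assumed to be non-empty.

```python
import json
import numpy as np


def mask_to_bbox(mask):
    """Return [x, y, width, height] of the non-zero region of a 2D mask."""
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    x, y = int(cols[0]), int(rows[0])
    return [x, y, int(cols[-1]) - x + 1, int(rows[-1]) - y + 1]


def mask_to_uncompressed_rle(mask):
    """Run lengths in column-major order, starting with the run of zeros."""
    flat = np.asarray(mask, dtype=bool).ravel(order="F")
    change = np.flatnonzero(flat[1:] != flat[:-1]) + 1
    bounds = np.concatenate(([0], change, [flat.size]))
    counts = np.diff(bounds).tolist()
    if flat[0]:
        # The first count always describes zeros, so it is 0 here.
        counts = [0] + counts
    return {"size": [int(mask.shape[0]), int(mask.shape[1])], "counts": counts}


def build_coco(image_name, height, width, masks, category_name="chromosome"):
    """Build a COCO-style dictionary for one image and its instance masks."""
    annotations = []
    for idx, mask in enumerate(masks, start=1):
        annotations.append({
            "id": idx,
            "image_id": 1,
            "category_id": 1,
            "bbox": mask_to_bbox(mask),
            "area": int(np.count_nonzero(mask)),
            "segmentation": mask_to_uncompressed_rle(mask),
            "iscrowd": 0,  # assumption: each chromosome is a single instance
        })
    return {
        "images": [{"id": 1, "file_name": image_name,
                    "height": height, "width": width}],
        "categories": [{"id": 1, "name": category_name}],
        "annotations": annotations,
    }


# Two toy instances standing in for the two overlapping chromosomes.
mask_a = np.zeros((4, 5), dtype=np.uint8)
mask_a[1:3, 1:4] = 1
mask_b = np.zeros((4, 5), dtype=np.uint8)
mask_b[0:4, 2] = 1

coco = build_coco("overlap_0001.png", 4, 5, [mask_a, mask_b])
coco_json = json.dumps(coco, indent=2)  # text to be saved as a .json file
```

Overlapping instances are not a problem here, since each annotation carries its own mask. Whether this particular layout passes the Detectron2 registration has not been checked.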
The whole process is available in a Jupyter notebook on Kaggle.
Unfortunately, the result is not a valid COCO dataset, as the dataset registration fails. I hope to get some help on the PyTorch forum or from Stack Overflow.
Tuesday, April 28, 2020
Karyotype of ten fibroblasts after Alu sequences and telomeres hybridization
Pairs of chromosomes are ordered by column from HSA 1 (left) to XY (right). Metaphases are ordered by row.
Example of a metaphase after alignment of the Alu image on the DAPI / telomere images:
Karyotyping:
Friday, January 31, 2020
2164 full resolution pairs of synthetic images of two overlapping chromosomes
Repair of the "overlapping_chromosomes_examples.h5" dataset:
- Some grayscale image components had black dots: those images were removed (with their corresponding ground-truth labels).
- The image dtype was int64; it is now np.uint8.
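The dtype change can be illustrated with a small sketch. It assumes the pixel values already fit in the 0-255 range, which is not stated above; the helper name `to_uint8` and the random test stack are made up for illustration and are not taken from the dataset.

```python
import numpy as np


def to_uint8(images):
    """Cast an integer array to np.uint8 after checking that no value is lost."""
    images = np.asarray(images)
    if images.min() < 0 or images.max() > 255:
        raise ValueError("values fall outside the 0-255 range of uint8")
    return images.astype(np.uint8)


# A random int64 stack standing in for the grayscale images.
rng = np.random.default_rng(0)
stack_int64 = rng.integers(0, 256, size=(3, 8, 8), dtype=np.int64)
stack_uint8 = to_uint8(stack_int64)
```

The cast keeps the pixel values unchanged and divides the memory footprint by eight, since each pixel goes from eight bytes to one.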
Dataset format:
Thursday, January 16, 2020
A conversion to COCO dataset format task in sight
Starting to think about how to convert the different overlapping-chromosome datasets into a COCO dataset: