DIP4FISH: 2021

Thursday, October 14, 2021

Installation of lightning-flash

With Anaconda installed on an Ubuntu 20.04 box:

Create a virtual environment at a chosen location on disk, using --prefix:

conda create --prefix /mnt/stockage/Developp/EnvPLFlash

and activate the env with:

conda activate /mnt/stockage/Developp/EnvPLFlash

Then install the libraries, starting with PyTorch 1.8 (LTS channel) with CUDA support:

conda install pytorch torchvision torchaudio cudatoolkit=10.2 -c pytorch-lts

then

pip install icedata

pip install lightning-flash

pip install notebook

pip install voila

Do not forget to install lightning-flash[image] to get the instance segmentation algorithms:

pip install 'icevision' 'lightning-flash[image]'

The installation can be checked by running the following notebook:
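Independently of the notebook, a quick sanity check can be run from a Python prompt inside the activated environment. This is only a minimal sketch: the import names listed (in particular `flash` for lightning-flash) are assumptions about how the packages installed above are exposed, and the helper name `check_packages` is made up for the illustration.

```python
import importlib.util

# Top-level import names assumed for the packages installed above
# (lightning-flash is assumed to be imported as "flash").
PACKAGES = ["torch", "torchvision", "flash", "icevision", "icedata"]


def check_packages(names):
    """Return {name: bool}, True when the top-level package can be located."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_packages(PACKAGES).items():
        print(f"{name:12s} {'OK' if found else 'MISSING'}")
    # The CUDA check only makes sense when torch itself is present.
    if importlib.util.find_spec("torch") is not None:
        import torch
        print("CUDA available:", torch.cuda.is_available())
```

A package reported as MISSING points to the corresponding pip or conda step above; a `CUDA available: False` line suggests a mismatch between the driver and the cudatoolkit version.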

Wednesday, July 7, 2021

Back to COCO 2: A less minimalistic 125 images +json dataset

125 greyscale images were chosen from a previous dataset available on GitHub.

An annotation file was generated by hand online with makesense.ai and saved as a single JSON file in COCO format. This small dataset is freely available as an archive.

Check annotation file validity with pycococreator.

Pycococreator by Waspinator was used to display the annotations (i.e. the segmentation) over a greyscale image in the following Jupyter notebook, which de facto validates the annotation file produced with makesense.ai:
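Besides the visual check, the structure of the JSON file can be verified programmatically. The following is a minimal sketch using only the standard library, not the pycococreator approach: it assumes the usual COCO layout (top-level `images`, `annotations` and `categories` lists, polygon segmentations stored as flat `[x1, y1, x2, y2, ...]` lists), and the function names are made up for the illustration.

```python
import json

REQUIRED_KEYS = ("images", "annotations", "categories")


def check_coco(coco):
    """Return a list of problems found in a COCO-style dict (empty if none)."""
    problems = [f"missing top-level key: {k}" for k in REQUIRED_KEYS if k not in coco]
    if problems:
        return problems

    image_ids = {img["id"] for img in coco["images"]}
    category_ids = {cat["id"] for cat in coco["categories"]}

    for ann in coco["annotations"]:
        ann_id = ann.get("id")
        # Every annotation must point to an existing image and category.
        if ann.get("image_id") not in image_ids:
            problems.append(f"annotation {ann_id}: unknown image_id")
        if ann.get("category_id") not in category_ids:
            problems.append(f"annotation {ann_id}: unknown category_id")
        # A polygon needs at least 3 points, i.e. 6 numbers, in x/y pairs.
        seg = ann.get("segmentation", [])
        if isinstance(seg, list):
            for poly in seg:
                if len(poly) < 6 or len(poly) % 2 != 0:
                    problems.append(f"annotation {ann_id}: malformed polygon")
    return problems


def check_coco_file(path):
    """Load a COCO JSON file and run the structural checks on it."""
    with open(path) as f:
        return check_coco(json.load(f))
```

Calling `check_coco_file` on the annotation file should return an empty list when the references and polygons are consistent; it says nothing about whether the outlines actually match the chromosomes, which is what the overlay display is for.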

Data registration in detectron2:

This is the next step.

The idea is to follow the tutorial on custom dataset registration, possibly using the balloons example by davamix.
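For a dataset already in COCO format, the registration could look like the sketch below. It relies on detectron2's `register_coco_instances` helper and `DatasetCatalog`; the wrapper name `register_chromosomes`, the dataset name and the paths in the usage note are hypothetical, and nothing here has been run on the 125-image dataset.

```python
def register_chromosomes(name, json_file, image_root):
    """Register a COCO-format dataset under `name` and return its records."""
    # Imported lazily so this module can be loaded without detectron2 installed.
    from detectron2.data import DatasetCatalog
    from detectron2.data.datasets import register_coco_instances

    # Empty metadata dict: class names are read from the JSON "categories".
    register_coco_instances(name, {}, json_file, image_root)
    # Loading the records forces the JSON to be parsed, which surfaces
    # path or format errors right away.
    return DatasetCatalog.get(name)
```

A call such as `register_chromosomes("chromosomes_train", "annotations.json", "images/")` (placeholder paths) would then make the name usable in a detectron2 config's `DATASETS.TRAIN` entry.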

Tuesday, June 15, 2021

Trying to get unstuck: back to COCO

The last post dates from October 2020. The main goal was to make progress on chromosome instance segmentation, but a robust U-Net-based semantic segmentation would have been satisfying too (possibly using fastai).

Detectron2, PixelLib and many others provide instance segmentation algorithms (Mask R-CNN, for example). To train a model, the COCO format for the so-called ground-truth labels seems to be mandatory. The issue is that, in the different datasets generated to simulate overlapping chromosomes, the labels are greyscale images that can be decomposed into binary masks for one-hot encoding: