Approach for Easy Visual Comparison between ground-truth and predicted classes

Although classification metrics are good for summarizing a model’s performance on a dataset, they disconnect the user from the data itself. Similarly, a confusion matrix might tell us that performance is suffering because of false positives, but it obscures information about what patterns may have caused those misclassifications and what types of false positives there might be. 

One way to gain interpretability is to group sampled images by the category of their output (true negative, false negative, false positive, true positive) and display them in a PowerPoint file for easy review. Grouping the images this way makes it easy to identify patterns in misclassified data that can be exploited to improve performance (e.g., hard negative mining or image-analysis-based filtering).
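
The grouping step can be sketched in a few lines of Python. This is only a minimal illustration for a binary classifier, not the workflow from the post: the file names, labels, and the helper names `confusion_category` and `group_by_category` are hypothetical, and building the slide deck itself is left out.

```python
from collections import defaultdict


def confusion_category(y_true: int, y_pred: int) -> str:
    """Name the confusion-matrix cell for one binary prediction (1 = positive)."""
    if y_true == 1:
        return "true positive" if y_pred == 1 else "false negative"
    return "false positive" if y_pred == 1 else "true negative"


def group_by_category(samples):
    """Group (image_name, y_true, y_pred) triples by confusion category."""
    groups = defaultdict(list)
    for name, y_true, y_pred in samples:
        groups[confusion_category(y_true, y_pred)].append(name)
    return dict(groups)


# Hypothetical file names and labels, for illustration only.
samples = [
    ("patch_001.png", 1, 1),
    ("patch_002.png", 0, 1),
    ("patch_003.png", 1, 0),
    ("patch_004.png", 0, 0),
    ("patch_005.png", 0, 1),
]

groups = group_by_category(samples)
for category, names in groups.items():
    print(category, names)
```

Each list in `groups` would then correspond to one section of the slide deck, so that, for example, all false positives can be reviewed side by side.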

This blog post describes and demonstrates a workflow that automatically produces such a PowerPoint slide deck for review.

Using QuPath To Help Identify An Optimal Threshold For A Deep Or Machine Learning Classifier

Digital pathology projects often require assigning a class to cells or other objects. For example, you may have a segmentation of cells/glomeruli/tubules and want to identify the ones which are lymphocytes/sclerotic/distal. This classification can be performed with machine or deep learning classifiers: the classifier is given the object in question and returns an output score indicating the likelihood that the object is of a particular type.

This blog post will demonstrate an efficient way of using QuPath to help find the ideal likelihood threshold for your classifier.
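
As a numeric complement to visual review, a threshold can also be chosen by sweeping candidate values over scored objects with known labels. The sketch below is not the QuPath-based method of the post; it assumes a binary task, uses Youden's J statistic (sensitivity + specificity - 1) as one possible selection criterion, and runs on made-up scores and labels.

```python
def youden_j(scores, labels, threshold):
    """Sensitivity + specificity - 1 when predicting positive for score >= threshold."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if s >= threshold and y == 1)
    fn = sum(1 for s, y in pairs if s < threshold and y == 1)
    tn = sum(1 for s, y in pairs if s < threshold and y == 0)
    fp = sum(1 for s, y in pairs if s >= threshold and y == 0)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return sensitivity + specificity - 1.0


def best_threshold(scores, labels, candidates):
    """Return the candidate with the highest J (the first one wins on ties)."""
    return max(candidates, key=lambda t: youden_j(scores, labels, t))


# Made-up classifier scores and ground-truth labels, for illustration only.
scores = [0.05, 0.20, 0.35, 0.40, 0.60, 0.70, 0.85, 0.95]
labels = [0, 0, 0, 1, 0, 1, 1, 1]
candidates = [i / 20 for i in range(1, 20)]  # 0.05, 0.10, ..., 0.95

best = best_threshold(scores, labels, candidates)
print(best, youden_j(scores, labels, best))
```

Several thresholds can tie on the same J value, which is one reason visually inspecting the objects near the candidate thresholds remains useful.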

A masterclass in Scientific CV writing

Introduction: Another day, another application form

Writing applications for jobs, grants and all manner of other reviews is a continual process within the scientific world. Forms tend to ask for specific, nuanced information, so more of our precious time is spent digging up decades’ worth of buried events just to evidence ‘A time I have communicated with a diverse audience’ than actually writing. Then, we have the doubt to contend with: What if I missed something? Surely I have a better example! I remember doing that – but when was it?

Given A) how short academic contracts can be and B) how many distinct workplaces our generation tends to work in over the course of a career, writing CVs can consume a considerable chunk of our adult lives. The application process is not going anywhere in the near future. We need to ask ourselves how we can make it as painless and efficient as possible.

Well, there are a few ‘hacks’. Apply for a few jobs and you will start to notice themes in the application process and in the ‘winning’ CVs. Let’s go over these themes and learn to not only ‘hack’ our time but more importantly, our success rate. Doing so, we can earn back so much more time to do the things we love – science!

How to Select the Correct Magnification and Patch Size for Digital Pathology Projects

In digital pathology, input data is often far too large for deep learning (DL) models to process directly, with Whole Slide Images (WSI) around 100k x 100k pixels. This post provides a quantitative and qualitative method, with code, to help optimize two important digital-pathology-specific hyperparameters: patch size and magnification. Optimizing these variables can decrease training times, lower hardware requirements, and reduce the amount of data required to effectively train a model.
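
To get a feel for why these two hyperparameters matter, the arithmetic below counts how many patches a regular grid yields from a 100k x 100k pixel slide. It is a rough sketch rather than the method from the post: the function name `patches_per_slide` is hypothetical, background tissue is not filtered out, and the downsample factor of 4 (for instance, going from 40x to 10x) is just an example value.

```python
def patches_per_slide(width_px, height_px, patch_size, downsample=1.0, stride=None):
    """Count patches in a regular grid over a slide viewed at `downsample`.

    `stride` defaults to `patch_size`, i.e. non-overlapping patches.
    Partial patches at the right and bottom edges are dropped.
    """
    stride = stride or patch_size
    width = int(width_px / downsample)
    height = int(height_px / downsample)
    if width < patch_size or height < patch_size:
        return 0
    cols = (width - patch_size) // stride + 1
    rows = (height - patch_size) // stride + 1
    return cols * rows


# Slide size taken from the text above; patch sizes and downsample are examples.
full_res_256 = patches_per_slide(100_000, 100_000, patch_size=256)
full_res_512 = patches_per_slide(100_000, 100_000, patch_size=512)
quarter_res_256 = patches_per_slide(100_000, 100_000, patch_size=256, downsample=4)

print(full_res_256, full_res_512, quarter_res_256)
```

Under these assumptions, reducing the magnification by a factor of 4 cuts the number of 256-pixel patches per slide by roughly a factor of 16, which is where the savings in training time and hardware come from.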

Application of ICC profiles to digital pathology images

Background on Color Calibration

Digital whole slide image scanners are designed to take stained tissue on glass slides and digitize it for use in the digital world. The process by which slide scanners perform this operation does not produce a perfect digital equivalent of the original slide, as the hardware involved (LED/bulb, camera sensor, quantizer) can introduce biases during the sampling process. For example, different camera sensors may detect colors with different levels of specificity/accuracy/density, resulting in similar but not perfect representations of the associated real-world subjects.

Concretely, there is often a difference between the color you perceive in the real world under a microscope and the color you would see in the corresponding digital copy of the same slide. This blog post discusses how to correct for this discrepancy using ICC profiles.

Using Paquo to directly interact with QuPath project files for usage in digital pathology machine learning

This is an updated version of the previously described workflow for loading and classifying annotations/detections created in QuPath for use in downstream machine learning workflows. The original post used a Groovy script (Groovy being the programming language used by QuPath) to export annotations/detections as GeoJSON from within QuPath, a Python script to classify them, and lastly another Groovy script to reimport them. If you are not familiar with QuPath and/or its annotations, you should probably read the original post first, both for better context on the respective workflows and to appreciate the more elegant approach taken here. If you are already using the original approach, you should be able to easily modify it to follow this newer one.
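
For context, the Python classification step of the original GeoJSON-based workflow can be sketched roughly as follows. This is a hedged illustration, not the code from either post and not the Paquo API: the inline GeoJSON is made up, the area cutoff stands in for a real classifier, and the `classification` property with a `name` field is an assumed layout that should be checked against a file exported by your own QuPath version.

```python
import json

# Made-up export with two square annotations, for illustration only.
geojson_text = """
{"type": "FeatureCollection", "features": [
  {"type": "Feature",
   "geometry": {"type": "Polygon",
                "coordinates": [[[0, 0], [10, 0], [10, 10], [0, 10], [0, 0]]]},
   "properties": {"objectType": "annotation"}},
  {"type": "Feature",
   "geometry": {"type": "Polygon",
                "coordinates": [[[0, 0], [5, 0], [5, 5], [0, 5], [0, 0]]]},
   "properties": {"objectType": "annotation"}}
]}
"""


def polygon_area(ring):
    """Area of a closed polygon ring via the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def classify(feature, area_cutoff=50.0):
    """Placeholder rule standing in for a real machine learning classifier."""
    outer_ring = feature["geometry"]["coordinates"][0]
    return "large" if polygon_area(outer_ring) >= area_cutoff else "small"


collection = json.loads(geojson_text)
for feature in collection["features"]:
    # Assumed property layout; verify against your own QuPath export.
    feature["properties"]["classification"] = {"name": classify(feature)}

updated_text = json.dumps(collection)  # would be written to disk for reimport
print([f["properties"]["classification"]["name"] for f in collection["features"]])
```

The export and reimport steps on either side of this script are what the newer approach aims to remove by working with the QuPath project files directly.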

Tutorial: Quick Annotator for Tubule Segmentation

The manual labeling of large numbers of objects is a frequent occurrence when training deep learning classifiers in the digital histopathology domain. Often this can become extremely tedious and potentially even insurmountable.

To aid people in this annotation process we have developed and released Quick Annotator (QA), a tool which employs a deep learning backend to simultaneously learn and aid the user in the annotation process. A pre-print explaining this tool in more detail is available [here].

Transferring data FASTER to the GPU With Compression

Utilization of current GPUs is often limited by the ability to get data onto and off the device quickly. More precisely, taking data from the host RAM and transferring it over the PCI-e bus to the GPU RAM is the bottleneck in many deep learning use cases.
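
The intuition can be checked with back-of-the-envelope arithmetic: if fewer bytes cross the bus, the transfer takes less time. The sketch below is only an illustration under loud assumptions. The bandwidth constant is a hypothetical round number, the payload is artificially repetitive, `zlib` on the CPU stands in for whatever codec is actually used, and the cost of decompressing on the GPU is not modeled at all.

```python
import zlib

# Hypothetical effective host-to-GPU bandwidth; measure your own system.
ASSUMED_BUS_BYTES_PER_SECOND = 12e9


def transfer_seconds(num_bytes, bytes_per_second=ASSUMED_BUS_BYTES_PER_SECOND):
    """Idealized transfer time: payload size divided by bandwidth."""
    return num_bytes / bytes_per_second


# Artificially repetitive 1 MiB payload, for illustration only.
raw = bytes(range(256)) * 4096
compressed = zlib.compress(raw, 1)  # level 1 favors speed over ratio

# Lossless round trip: the receiver can recover the original bytes.
restored = zlib.decompress(compressed)

ratio = len(raw) / len(compressed)
print(f"compression ratio: {ratio:.1f}x")
print(f"raw transfer:        {transfer_seconds(len(raw)):.2e} s")
print(f"compressed transfer: {transfer_seconds(len(compressed)):.2e} s")
```

Whether compression pays off in practice depends on how compressible the data is and on how quickly it can be decompressed on the device, neither of which this sketch measures.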

The noise in our digital pathology slides

While adding new features to HistoQC, I stumbled upon a very interesting insight that I thought I would take a moment to share. The amount of noise and artifacts in digital pathology (DP) whole slide images (WSI) is far more extensive than I had previously thought.
