Using QuPath To Help Identify An Optimal Threshold For A Deep Or Machine Learning Classifier

Digital pathology projects often require assigning a class to cells/objects. For example, you may have a segmentation of cells/glomeruli/tubules and want to identify the ones which are lymphocytes/sclerotic/distal. This classification process can be done using machine or deep learning classifiers by supplying the object in question and receiving an output score that indicates the likelihood that the object belongs to a particular class.

This blog post will demonstrate an efficient way of using QuPath to help find the ideal likelihood threshold for your classifier.

Since different images may have different levels of performance with the same classifier (often attributed to domain shift), it is occasionally necessary to select cohort-specific thresholds for a classifier to perform ideally. There may also be a need to choose a different threshold for each whole slide image (WSI) to obtain the most accurate slide-level results.

This threshold selection process, in its simplest form, amounts to moving the cutoff at which an object is considered “positive” or “negative” for a certain type (see image above). To select the threshold, domain experts (such as pathologists) often need to check results qualitatively, or quantitatively if ground truth values are available, at different thresholds.
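Conceptually, the whole step boils down to a one-line comparison. A minimal sketch in Python (with made-up scores and class names, purely for illustration):

```python
# Sketch of the thresholding step: a classifier emits one real-valued
# score per object, and a chosen threshold splits objects into classes.
def classify(scores, threshold):
    return ["Positive" if s >= threshold else "Negative" for s in scores]

scores = [-4.61, 1.25, 0.10, 2.70]  # toy prediction values
print(classify(scores, 0.5))  # ['Negative', 'Positive', 'Negative', 'Positive']
```

The hard part is not this comparison, but letting an expert explore many values of `threshold` efficiently, which is what the rest of this post addresses.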

A simple approach to enable review of different thresholds is to prepare images with, e.g., the resulting classified cells at each candidate threshold value, and to let the pathologists compare them and choose the best one. However, this approach is not ideal: (a) it limits the number of choices available for review, discretizing a real-valued threshold into only a handful of options, and (b) it results in significant data storage/transfer requirements (if you want to test 10 thresholds on 100 images, you end up having to create and send 1,000 images!).

The more ideal way of performing this threshold selection process is to enable an expert to work directly with the real-valued data. This lets them choose exactly the threshold they want using a slider, which is actually quite easy to set up in QuPath.

Let's take a look!

Python Work – File Creation

To be able to visualize annotations with different threshold values in QuPath, there are two solutions: (a) creating a GeoJSON file or (b) using Paquo.

GeoJSON Approach

For the GeoJSON solution, you need to create a GeoJSON file that will be recognized by QuPath. We’ve discussed this topic extensively in our previous post, but as a brief reminder, the format is as follows (toy example with 2 cells):

[{"type": "Feature", 
 "id": "PathCellObject", 
 "geometry": {"type": "Polygon", "coordinates": [[[886, 8], [883, 42], [884, 43], [891, 8], [886, 8]]]}, 
 "properties": {"objectType": "cell", "isLocked": false, "measurements": [{"name": "Prediction", "value": -4.61}]}},
 {"type": "Feature", 
 "id": "PathCellObject", 
 "geometry": {"type": "Polygon", "coordinates": [[[303, 20], [294, 30], [294, 36], [295, 37], [296, 40], [303, 20]]]}, 
 "properties": {"objectType": "cell", "isLocked": false, "measurements": [{"name": "Prediction", "value": 1.25}]}}]

The contours must be in geometry -> coordinates, and the output of the classifier in properties -> measurements, with a name (here “Prediction”, but this could be renamed as you desire) and the associated value under consideration for thresholding.

Code example to create a GeoJSON file with 2 cells:

import geojson
import uuid

def generate_geojson(cells):
    """
    Convert the contours and a prediction value for each cell into
    a GeoJSON feature list (for QuPath).

    cells: list of annotated cells on the image (each cell is
           a dict with a "contour" and a "pred")
    """
    data = []
    for cell in cells:
        feature = {}
        coords = list(cell["contour"])
        # make sure the shape is closed (first point == last point)
        if coords[0] != coords[-1]:
            coords.append(coords[0])
        feature["type"] = "Feature"
        feature["id"] = "PathCellObject"  # QuPath >= 0.4: can be a unique id (str(uuid.uuid1()))
        feature["geometry"] = {"type": "Polygon", "coordinates": [coords]}
        feature["properties"] = {"objectType": "cell", "isLocked": False,
                                 "measurements": [{"name": "Prediction",
                                                   "value": cell["pred"]}]}
        data.append(feature)
    return data

point1 = {"contour": [[886, 8], [883, 42], [884, 43], [891, 8], [886, 8]], "pred": -4.61}
point2 = {"contour": [[303, 20], [294, 30], [294, 36], [295, 37], [296, 40], [303, 20]], "pred": 1.25}

with open("example.geojson", "w") as outfile:
    geojson.dump(generate_geojson([point1, point2]), outfile)
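Before dragging the file into QuPath, it can be worth sanity-checking the structure. The helper below is an illustrative sketch (not part of QuPath or the geojson package) that checks each feature for the fields QuPath relies on:

```python
import json

def validate_features(features):
    """Check that each feature has the fields QuPath expects."""
    for f in features:
        assert f["type"] == "Feature"
        ring = f["geometry"]["coordinates"][0]
        assert ring[0] == ring[-1]  # polygon must be closed
        names = [m["name"] for m in f["properties"]["measurements"]]
        assert "Prediction" in names  # measurement used for thresholding
    return len(features)

# toy feature matching the format shown above; for a written file,
# use: validate_features(json.load(open("example.geojson")))
features = [{"type": "Feature",
             "id": "PathCellObject",
             "geometry": {"type": "Polygon",
                          "coordinates": [[[886, 8], [883, 42], [884, 43], [891, 8], [886, 8]]]},
             "properties": {"objectType": "cell", "isLocked": False,
                            "measurements": [{"name": "Prediction", "value": -4.61}]}}]
print(validate_features(features))  # 1
```

If a polygon is not closed or a measurement name is misspelled, the classifier setup in QuPath will silently have nothing to threshold, so failing fast here saves debugging later.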

For QuPath < 0.4, the id should be PathCellObject to be able to use the single measurement classifier in the next step. In QuPath >= 0.4, this can be a unique id, usually made with str(uuid.uuid1()).

Paquo Approach

For the second solution with Paquo, you can interact directly with QuPath project files. It was also discussed previously in our blog post.

Code example to add the annotations (they are, in fact, “detection”-type objects!):

import os
# point paquo at the QuPath install before importing it
os.environ["PAQUO_QUPATH_DIR"] = r"C:\Users\ajanowc\AppData\Local\QuPath-0.4.0"

from paquo.projects import QuPathProject
from paquo.images import QuPathImageType
from shapely.geometry import Polygon

def add_annotations(cells, qpimage):
    """
    Save the contours and a prediction value for each cell
    directly in the QuPath project.

    cells: list of annotated cells on the image (each cell is
           a dict with a "contour" and a "pred")
    """
    for cell in cells:
        polygon = Polygon(cell["contour"])
        measurement = {"Prediction": cell["pred"]}
        qpimage.hierarchy.add_detection(roi=polygon, measurements=measurement)

point1 = {"contour": [[886, 8], [883, 42], [884, 43], [891, 8], [886, 8]], "pred": -4.61}
point2 = {"contour": [[303, 20], [294, 30], [294, 36], [295, 37], [296, 40], [303, 20]], "pred": 1.25}

with QuPathProject('./example_project', mode='w') as qpout:
    image_fname = 'example.tif'
    entry = qpout.add_image(image_fname, image_type=QuPathImageType.BRIGHTFIELD_H_E)
    add_annotations([point1, point2], entry)

Be sure to use the add_detection method and not the add_annotation method: add_detection creates a Detection object, while add_annotation creates an Annotation object, and a classifier can only be applied to Detection objects. (A cell is a subtype of Detection.)

QuPath Work – Threshold Selection

Now, in QuPath, if you took the Paquo approach, you can load the project file, which loads both the image and the annotations together. If you used the GeoJSON approach, you will first need to load the image and then “drag + drop” the created GeoJSON file onto the image.

Afterward, you must create the two classes (positive/negative, yes/no, lymphocyte/other, etc.) that will be used in the classification process.

Go to the Annotations tab, where you can add/modify the classes. You can choose the colors by double-clicking on a class.

Then select Classify -> Object classification -> Create single measurement classifier

Here you have to define the parameters. The Measurement field must match the name chosen in the GeoJSON/Paquo file (“Prediction” in our example). The two classes made above must also be selected, to identify objects as being above/below the threshold.

Do not forget to enable Live preview so you can see the result while moving the Threshold slider.

You can now move the window to the side and play with the slider (or enter a value manually) to view the classification result at different thresholds. You can also move around the image as needed (zoom in, zoom out, pan, etc.). Example below with a much larger GeoJSON file.

You can also hide or change the aspect of the annotations (for better visibility) without needing to close the classifier window:

You can find this example image, and the associated QuPath project file, GeoJSON files, and everything else here!

And that's it, happy threshold selecting!

Special thanks to Jonatan Bonjour, an MS graduate student at EPFL, for doing a lot of the heavy lifting on this post!
