Category Archives: Python

Serialization and storage of GeoJSON in Digital Pathology

GeoJSON, a widely used format based on JSON (JavaScript Object Notation), is specifically designed for encoding a variety of geographic data structures. This versatile format excels in representing simple geographical features, such as points, lines, and polygons, along with their non-spatial attributes. In the realm of digital pathology, GeoJSON has emerged as a common format for storing annotations, enabling precise documentation of regions of interest, cellular structures, and other critical details within pathology images. The popularity of GeoJSON in this field is bolstered by its broad support across numerous tools (e.g., QuPath), which facilitates seamless integration and analysis in digital pathology workflows.

Despite its widespread adoption, there are several open questions regarding the efficient use of GeoJSON that can significantly impact performance. One key concern is the best method for storing GeoJSON in a compressed format to minimize storage requirements while preserving the integrity of the data. Efficient compression techniques are crucial, especially when dealing with large-scale pathology datasets.
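As a rough illustration of the storage question, the sketch below writes a small GeoJSON FeatureCollection to gzip-compressed bytes and reads it back, using only the Python standard library. The toy square annotations, the `classification` property name, and the helper function names are assumptions made for this example; gzip is just one possible choice and is not necessarily the method the full post recommends.

```python
import gzip
import json


def make_polygon_feature(coords, label):
    """Build a GeoJSON Feature holding one polygon annotation."""
    return {
        "type": "Feature",
        "geometry": {"type": "Polygon", "coordinates": [coords]},
        # "classification" is a hypothetical property name for this sketch
        "properties": {"classification": label},
    }


def to_gzip_bytes(feature_collection):
    """Serialize to compact JSON text, then gzip-compress the UTF-8 bytes."""
    raw = json.dumps(feature_collection, separators=(",", ":")).encode("utf-8")
    return gzip.compress(raw)


def from_gzip_bytes(blob):
    """Decompress gzip bytes and parse the JSON back into Python objects."""
    return json.loads(gzip.decompress(blob).decode("utf-8"))


# Toy data: 1000 square annotations (closed rings, first point == last point)
features = [
    make_polygon_feature(
        [[x, 0], [x + 10, 0], [x + 10, 10], [x, 10], [x, 0]], "tumor"
    )
    for x in range(1000)
]
collection = {"type": "FeatureCollection", "features": features}

raw_size = len(json.dumps(collection).encode("utf-8"))
blob = to_gzip_bytes(collection)
restored = from_gzip_bytes(blob)

print(f"uncompressed: {raw_size} bytes, gzip: {len(blob)} bytes")
print("round trip preserved the data:", restored == collection)
```

Because compression here is lossless, the restored object compares equal to the original, so the annotation data stays intact. How much space is saved depends on the annotations themselves.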


Data Exploration Of Features For Outcome Association In Digital Pathology

Introduction

In the field of digital pathology, a frequent approach for the creation of image-based biomarkers involves extracting features from scanned pathology slides. These features, which are often related to the morphology or spatial distribution of various tissue or cell types, provide valuable insights into the underlying biology of diseases. In cancer research, it is particularly important to examine how these features correlate with clinical outcomes such as overall survival (OS), progression-free survival (PFS), or other binary outcomes (e.g., response to a specific treatment).

Here we release Python code that can be executed in a notebook to support this process. It accepts a pandas DataFrame and generates a one-page summary PDF file, facilitating the analysis of individual features and their potential correlation with clinical outcomes.


Ray: An Open-Source API For Easy, Scalable Distributed Computing In Python – Part 3 Intro to Serving Models

Through a series of 4 blog posts, we’ll discuss and provide working examples of how one can use the open-source library Ray to (a) scale computing locally (single machine), (b) distribute scaling remotely (multiple machines), and (c) serve deep learning models across a cluster (two posts on this topic, basic and advanced). Please note that the blog posts in this series increase in difficulty!

This is the second-to-last blog post in the series (the first one here, second one here), where we will go into greater detail about how we can use Ray Serve to set up a server that waits to respond to our processing requests. These last two are the most complex blog posts in the series and require some understanding of how HTTP, REST, and web services work. You can find relevant prereading here.

Ray Serve is a scalable model serving library for building online inference APIs. Serve is framework agnostic, so you can use a single toolkit to serve everything from deep learning models built with frameworks like PyTorch, TensorFlow, and Keras, to scikit-learn models, to arbitrary Python business logic.


Ray: An Open-Source API For Easy, Scalable Distributed Computing In Python – Part 1 Local Scaling

Through a series of 4 blog posts, we’ll discuss and provide working examples of how one can use the open-source library Ray to (a) scale computing locally (single machine), (b) distribute scaling remotely (multiple machines), and (c) serve deep learning models across a cluster (basic/advanced). Please note that the blog posts in this series increase in difficulty!

I am personally very excited by the opportunities afforded by Ray. It's been a long-time desire of mine to have such an easy-to-use library!

Okay, let's start off by talking about scaling local computation with Ray!
