Working with Aperio SVS files in Matlab – Converting Annotations to Binary Masks

One of the main purposes of having a digital format is to allow experts (e.g., pathologists) to annotate certain structures in the images, be they nuclei, epithelium/stroma regions, tumor/non-tumor tissue, etc. This is easily done with ImageScope and SVS files; the trick is importing the annotations into Matlab.

In this case, we’ll run through a quick two-region mask to illustrate the general approach and code structure needed to create binary masks. Here we can see two green annotations made using the pen tool:


ImageScope saves the annotation x,y coordinates relative to the base image of the pyramid (as discussed here). It stores them in an XML file with the same name as the original file, except with an .xml extension. From there, the steps are rather straightforward:

  1. Parse xml file
  2. Extract x,y coordinates
  3. Make binary mask using x,y points

One caveat, which we discussed in the previous tutorial, is that the base image is likely too large to be loaded into memory. This means that making a binary mask of the same size as the base image is also unlikely to work. To compensate, we’ll scale the binary mask down to a different layer in the image pyramid, one which fits easily into memory.

Looking at the XML

Let’s see what the XML looks like:


We can see some interesting pieces of information, such as the microns per pixel, but we’re more focused on the region annotations. Here we can see two regions, with ID=”1” and ID=”2” (lines 14 and 392 respectively). The attributes associated with each region contain things like length, area, etc. More importantly, we can see that region 2 is made up of a set of vertices consisting of x,y coordinates. Our goal is to extract these in an orderly fashion.
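For reference, the relevant part of an ImageScope annotation file looks roughly like this (a sketch with made-up coordinate values and abbreviated attributes; the exact attribute set varies by ImageScope version):

```xml
<Annotations MicronsPerPixel="0.2461">
  <Annotation Id="1" LineColor="65280">
    <Regions>
      <Region Id="1" Length="..." Area="...">
        <Vertices>
          <Vertex X="14271" Y="10103"/>
          <Vertex X="14278" Y="10112"/>
          <!-- ... one Vertex per point along the pen stroke ... -->
        </Vertices>
      </Region>
    </Regions>
  </Annotation>
</Annotations>
```

The nesting is what makes the DOM approach below so convenient: every Region contains an ordered list of Vertex elements.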

Extracting the XML

Since the XML is rather well structured, we can use Matlab’s underlying Xerces DOM class to extract the data. The code looks something like this:

  svs_file='TCGA-A1-A0SD-01Z-00-DX1.DB17BFA9-D951-42A8-91D2-F4C2EBC6EB9F.svs';
  xml_file='TCGA-A1-A0SD-01Z-00-DX1.DB17BFA9-D951-42A8-91D2-F4C2EBC6EB9F.xml';

  xDoc = xmlread(xml_file);
  Regions=xDoc.getElementsByTagName('Region'); % get a list of all the region tags
  for regioni = 0:Regions.getLength-1
      Region=Regions.item(regioni);  % for each region tag

      %get a list of all the vertices (which are in order)
      vertices=Region.getElementsByTagName('Vertex');
      xy{regioni+1}=zeros(vertices.getLength,2); %allocate space for them
      for vertexi = 0:vertices.getLength-1 %iterate through all vertices
          %get the x value of that vertex
          x=str2double(vertices.item(vertexi).getAttribute('X'));

          %get the y value of that vertex
          y=str2double(vertices.item(vertexi).getAttribute('Y'));
          xy{regioni+1}(vertexi+1,:)=[x,y]; % finally save them into the array
      end
  end


Afterwards, we can plot the results using code like this:

  figure,hold all
  set(gca,'YDir','reverse'); %invert y axis
  for zz=1:length(xy)
      plot(xy{zz}(:,1),xy{zz}(:,2),'LineWidth',12)
  end

The only issue is that we need to invert the y-axis, because images place the origin (0,0) at the upper-left corner, while Matlab’s default plotting uses the bottom-left corner as the origin. The results of the plotting look like this, and they are exactly the same shape as our original annotations (!):




Making the Mask

Now the issue is that we have a plot but not a binary mask. Maybe we can just use the coordinates directly; let’s try to allocate the necessary memory:



Okay, that’s not going to work; we have the same issue as we’ve discussed before. The base image is simply too large to fit into memory, and while the binary mask is 1/3 the size (single channel versus RGB), it’s still too large for my 16GB of RAM.
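As a back-of-the-envelope check, we can compute the sizes involved (a sketch; note that Matlab’s zeros() allocates doubles at 8 bytes per element, so a naively allocated mask is actually larger in memory than the uint8 RGB image itself):

```matlab
svsinfo = imfinfo(svs_file);             % svs_file as defined above
nrow = svsinfo(1).Height;
ncol = svsinfo(1).Width;                 % dimensions of the base level
fprintf('RGB uint8 image: %.1f GB\n', nrow*ncol*3 / 2^30);
fprintf('mask as double : %.1f GB\n', nrow*ncol*8 / 2^30); % zeros() default class
fprintf('mask as logical: %.1f GB\n', nrow*ncol*1 / 2^30); % 1 byte/pixel in memory
```

Even the most compact in-memory representation is still tens of gigabytes for a typical base level, which motivates working at a smaller pyramid layer.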


So what if we still want to look at it? Well we can apply the same trick as before and scale the points to match a smaller pyramidal layer.

  1. svsinfo=imfinfo(svs_file);
  2. s=1; %base level of maximum resolution
  3. s2=5; % down sampling of 1:32
  4. hratio=svsinfo(s2).Height/svsinfo(s).Height;  %determine ratio
  5. wratio=svsinfo(s2).Width/svsinfo(s).Width;
  7. nrow=svsinfo(s2).Height;
  8. ncol=svsinfo(s2).Width;
  9. mask=zeros(nrow,ncol); %pre-allocate a mask
  10. for zz=1:length(xy) %for each region
  11.     smaller_x=xy{zz}(:,1)*wratio; %down sample the region using the ratio
  12.     smaller_y=xy{zz}(:,2)*hratio;
  14.     %make a mask and add it to the current mask
  15.     %this addition makes it obvious when more than 1 layer overlap each
  16.     %other, can be changed to simply an OR depending on application.    
  17.     mask=mask+poly2mask(smaller_x,smaller_y,nrow,ncol);
  18. end

From here we can simply display the mask with imshow(mask):


And see that the interior is indeed the same as what was marked above in the ImageScope version.

All done!

Full source available here

27 thoughts on “Working with Aperio SVS files in Matlab – Converting Annotations to Binary Masks”

    1. I’ve never had access to such a machine so unfortunately I don’t know :-\ If you find out, I’d appreciate any insights you may discover!

  1. Have you used only the level s3=3 for svs splitting in all of your research?

    Is there a way to make a precise xml mask for the base level s=1,
    maybe by:

    1) Making a window sized 20*20 pixels.
    2) Iterating it through the mask at resolution s2=5.
    3) Rescaling the poly for a piece of the svs backwards to s=1,
    multiplying the components of mask2poly_for_window by 1/hratio(..) and 1/wratio(…) correspondingly.

    We would have enough memory, because we save not a huge mask of the whole svs (9*10^4 x 9*10^4) but a small piece of it.
    The price is a large number of iterations.

    Is there any need for such a procedure, or would we basically just need more maxpools further on if we use the same operating memory?

    1. we use all levels depending on the necessary magnifications.

      That said, for very large images (WSI), we’ve found that it’s far easier to extract smaller regions which are manageable in size, allowing for easier parallelization and viewing. Reading and writing individual tiles tends to be extremely slow, and error-prone as it requires a number of wrappers to address the various formats/styles (openslide, etc). That is to say, even if I had a 100,000 x 100,000 mask for an associated 100k x 100k image, I’m not entirely sure what I would do with it 🙂

  2. Line 10 in the last code block: no need to use the loop. The poly2mask function does things in one go.

    Great work anyway. Very helpful.

    1. i think you’re missing the case where there is more than 1 annotated region. poly2mask will *connect* all of the vertices in the list, thus merging two regions together. the for loop generates them separately and then combines them
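A tiny demo of the point above, with made-up coordinates: passing both regions’ vertices to poly2mask in a single call treats them as one polygon, while per-region calls OR’d together keep them separate:

```matlab
p1 = [2 2; 2 5; 5 5; 5 2];            % two disjoint squares
p2 = [7 7; 7 10; 10 10; 10 7];
merged   = poly2mask([p1(:,1);p2(:,1)], [p1(:,2);p2(:,2)], 12, 12);
separate = poly2mask(p1(:,1),p1(:,2),12,12) | poly2mask(p2(:,1),p2(:,2),12,12);
% the merged version fills extra pixels bridging the two squares,
% so the pixel counts disagree:
fprintf('merged: %d px, separate: %d px\n', nnz(merged), nnz(separate));
```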

  3. As you presented here, we can finally view the binary mask of a smaller pyramidal layer. But our goal is to get the ROI of the base level image and then divide the ROI to patches of fixed size. Could you give me some advice on how to realize my goal?

    1. Not sure i understand your question? if you know where the ROI is, why not just use imread(filename,'Index',base_index,'PixelRegion',{[start_row end_row],[start_col end_col]});
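Fleshing out the one-liner above: since the vertices in the xy{} cell array from the parsing code are already in base-level coordinates, a region’s bounding box can be read straight from the base level (a sketch, using the first region as an example):

```matlab
v = xy{1};                                   % vertices of the first region
rows = [floor(min(v(:,2))) ceil(max(v(:,2)))];
cols = [floor(min(v(:,1))) ceil(max(v(:,1)))];
% pull only the bounding box from the (larger-than-memory) base level
roi = imread(svs_file,'Index',1,'PixelRegion',{rows,cols});
% a matching base-level mask for just this patch, shifted to patch coordinates
roi_mask = poly2mask(v(:,1)-cols(1)+1, v(:,2)-rows(1)+1, ...
                     rows(2)-rows(1)+1, cols(2)-cols(1)+1);
```

The patch and its mask are then small enough to tile further into fixed-size training patches.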

  4. I should have described my question more clearly. I mean, how do I extract the
    ROI from the base level of the SVS image? As you described, we can first open the SVS image with Aperio ImageScope and delineate the ROI. Next we can extract the x,y coordinates, which are saved in an .xml file. What confused me was whether I could use this XML file to extract the ROI from the base-level image (because you say we can only view the binary mask at a smaller pyramidal layer)

  5. We have annotations with different colors for the purpose of highlighting different regions of interest (e.g., In Situ, Invasive, Benign, Normal). Can we isolate these annotations and create a mask for each of them?
    It would be far too messy for one mask to contain all these annotations at the same time.

    1. certainly: the “LineColor” attribute of the “Annotation” object can be used to tell the different annotations apart. the simplest way is probably to make a single script per color of interest
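A sketch of that idea, wrapping the region loop from the post in a per-annotation filter (the color value to match is an assumption; inspect your own XML to see which decimal LineColor values your annotations use):

```matlab
xDoc = xmlread(xml_file);
Annotations = xDoc.getElementsByTagName('Annotation');
for ai = 0:Annotations.getLength-1
    Annotation = Annotations.item(ai);
    color = char(Annotation.getAttribute('LineColor')); % stored as a decimal number
    if ~strcmp(color,'65280'), continue; end            % e.g., keep only green (assumed value)
    % restrict the search to this annotation's regions, then
    % extract vertices and build the mask exactly as in the post
    Regions = Annotation.getElementsByTagName('Region');
end
```

Calling getElementsByTagName on the Annotation node (rather than on xDoc) is what scopes the Region list to a single color.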

      1. Thanks 🙂
        I think this narrowed down my problem a little. However, now when I run the code, I don’t receive an annotated mask but just a black image with no regions suggested from the XML file. Have you encountered this issue?

        1. hmmm no i have not… you should walk through the code step by step in the debugger and make sure that at each step you’re extracting the data that you think you are. if nothing is extracted you’ll be producing a black mask.

          1. I’ve tested the code with much simpler annotations. The mask was extracted with no problem except for the numbered choice of ‘s2’. It helped make the mask have the desired dimensions we wanted. How did you choose the values of ‘s’ and ‘s2’?

          2. experience : ) the annotations themselves are typically stored in coordinates relative to the base magnification. the second choice is based on what size image will fit in computer memory and/or what i need for the particular project. so while 1 is fixed, the other should be able to be modified seamlessly (hopefully)

  6. Is it possible the base image is too large for Matlab to read? svsinfo doesn’t display the base resolution for some unknown reason, but it shows other layers. The base resolution for this specific .svs file is 157158×81323 (quite large).

    1. sure, it’s possible. you should try loading a smaller level to get some indication of whether the file is corrupt or not. regardless, you’ll always be limited by the size of the ram in the machine, but one can easily figure out the size of the image since it must be stored in an uncompressed matrix format. so in your case 157k x 81k x 3 (RGB channels) = ~40GB of ram

      1. I see. I tried reducing the original .svs file, cropping out unnecessary space where annotations are not involved, and Matlab finally picked up the base image.
        Of course the mask must be down-scaled, but when you want to train images with their corresponding masks as you did in the USE CASES, don’t you want the histology slide image and mask to be proportional to each other (equal pixel size)?

        By the way, thank you very much for being patient with my questions. I really appreciate it 🙂

        1. absolutely, if there is not a 1-to-1 correspondence between the spatial locations in the mask and the associated pixels in the original image, then the training won’t be exposed to the correct labels. things will go poorly very very quickly 🙂 in the use cases, when i downsampled the mask, i also downsampled the image by the same factor, so they’re both the same size at the same resolution. one nice thing about the mask is that by definition it can take 1/3rd * 1/8th of the space by storing it as a single-channel binary image (e.g., bool), so loading it into memory becomes a lot easier. from there, it’s possible to find potential regions of interest and pull up just those smaller regions from the larger-than-memory image
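Concretely, the pairing described above might look like this (a sketch reusing s2 and mask from the post; note that in Matlab a logical array occupies 1 byte per element in memory, and reaches the 1-bit figure only when written out as a bitmapped image file):

```matlab
small_img = imread(svs_file,'Index',s2);   % same pyramid level as the mask
mask = logical(mask);                      % collapse the additive mask to binary
% image and mask now align pixel-for-pixel at the same resolution
assert(isequal([size(small_img,1) size(small_img,2)], size(mask)));
```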

          1. Some of the down-sampled dimensions taken from the .svs structure are drastic. I have a base image of 103,947 x 81,323, and the next largest down-sample option is 25,986 x 20,330. If we reduce the original .svs WSI to these dimensions as well, won’t we lose a lot of information on the original image (for when we want to train)?

          2. not sure i understand what you’re suggesting? downsampling will always result in information loss. if the image is too big, loading and shrinking it in smaller tiles and patching them together is another option
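One way to sketch that tiled approach (hypothetical tile size and downsampling factor; assumes rounding at tile borders is acceptable for your application):

```matlab
svsinfo = imfinfo(svs_file);
s = 1; factor = 32; tile = 4096;                 % hypothetical choices
H = svsinfo(s).Height; W = svsinfo(s).Width;
small = zeros(ceil(H/factor), ceil(W/factor), 3, 'uint8');
for r = 1:tile:H
    for c = 1:tile:W
        r2 = min(r+tile-1, H); c2 = min(c+tile-1, W);
        % read one manageable tile from the base level
        patch = imread(svs_file,'Index',s,'PixelRegion',{[r r2],[c c2]});
        rs = ceil(r/factor):ceil(r2/factor);     % destination rows
        cs = ceil(c/factor):ceil(c2/factor);     % destination cols
        small(rs,cs,:) = imresize(patch, [numel(rs) numel(cs)]);
    end
end
```

Each tile fits in memory on its own, and the shrunken tiles are stitched into a single downsampled image at the end.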

          3. Can you briefly explain/share the code (if available) of how you downscaled the WSI image by the same factor? I’m quite new to Matlab so I’m still learning the ropes 🙁
