Previously we looked at extracting annotations from Aperio SVS files. There are, however, other image formats and annotation tools. Another commonly used tool in digital histology is ImageViewer, which makes it possible to view multi-page BigTiff image files.
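In this case, we’ll assume that each annotated region of interest (ROI) is enclosed in a black rectangle. We had a pathologist annotate lymphocytes (blue), stroma (green) and tumor (red). We start by reading the XML annotation file that sits next to the image (the variable bigtiff_file holds the image filename) and creating an output directory: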
- xml_file=strrep(bigtiff_file,'.tif','.xml');
- xDoc = xmlread(xml_file);
- mkdir('subs');
Then we’ll go through all of the annotated pieces and look for ones which have a black line color. These have an upper left and bottom right corner.
- %find all rectangle regions and stick them in a struct, roi(..).ulx urx etc
- %etc
- Rois=[];
- Regions=xDoc.getElementsByTagName('Annotation'); % get a list of all the region tags
- for regioni = 0:Regions.getLength-1
- Region=Regions.item(regioni);
- if(str2double(Region.getAttribute('LineColor'))==0) % ROI
- %get a list of all the vertexes (which are in order)
- verticies=Region.getElementsByTagName('Vertex');
- ulx=str2double(verticies.item(0).getAttribute('X'));
- uly=str2double(verticies.item(0).getAttribute('Y')); %% upper left
- lrx=str2double(verticies.item(1).getAttribute('X'));
- lry=str2double(verticies.item(1).getAttribute('Y')); %% lower right
- Rois(end+1).lxlyrxry=[ulx uly lrx lry];
- end
- end
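For readers working outside MATLAB, the ROI search above can be sketched in Python with only the standard library. This is a minimal sketch, not a tested port. The element and attribute names (Annotation, LineColor, Vertex, X, Y) are taken from the MATLAB code, while the sample XML and the function name find_rois are made up for illustration. Like the MATLAB version, it assumes the first vertex is the upper left corner and the second is the lower right.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature annotation file, invented for illustration only.
# It contains just the names the MATLAB code reads.
EXAMPLE_XML = """<Annotations>
  <Annotation LineColor="0">
    <Vertex X="100" Y="200"/>
    <Vertex X="400" Y="600"/>
  </Annotation>
  <Annotation LineColor="255">
    <Vertex X="150" Y="250"/>
    <Vertex X="180" Y="250"/>
    <Vertex X="165" Y="300"/>
  </Annotation>
</Annotations>"""


def find_rois(root, roi_color=0):
    """Return [ulx, uly, lrx, lry] for every annotation drawn in roi_color."""
    rois = []
    for annotation in root.iter("Annotation"):
        # A missing LineColor becomes NaN, which never equals roi_color.
        if float(annotation.get("LineColor", "nan")) != roi_color:
            continue
        vertices = list(annotation.iter("Vertex"))
        if len(vertices) < 2:
            continue  # not a usable rectangle
        # Assumption carried over from the MATLAB code:
        # vertex 0 is the upper left corner, vertex 1 the lower right.
        ulx, uly = float(vertices[0].get("X")), float(vertices[0].get("Y"))
        lrx, lry = float(vertices[1].get("X")), float(vertices[1].get("Y"))
        rois.append([ulx, uly, lrx, lry])
    return rois


rois = find_rois(ET.fromstring(EXAMPLE_XML))
print(rois)
```

Only the black (LineColor 0) annotation is returned; the colored one is left for the next step.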
Now knowing where all of the ROIs are, we can iterate through each of the annotations and determine which ROIs it belongs to.
- num_roi=length(Rois);
- if(isempty(Rois))
- return
- end
- % loop through all remaining
- %if points are less than or greater than roi, add to roi(..).(lcolor).{i1}
- Regions=xDoc.getElementsByTagName('Annotation'); % get a list of all the region tags
- for regioni = 0:Regions.getLength-1
- Region=Regions.item(regioni);
- linecolor=str2double(Region.getAttribute('LineColor'));
- if(linecolor~=0) % not an roi.
- %get a list of all the vertexes (which are in order)
- verticies=Region.getElementsByTagName('Vertex');
- xy=zeros(verticies.getLength,2); %allocate space for them (one row per vertex)
- for vertexi = 0:verticies.getLength-1 %iterate through all verticies
- %get the x value of that vertex
- x=str2double(verticies.item(vertexi).getAttribute('X'));
- %get the y value of that vertex
- y=str2double(verticies.item(vertexi).getAttribute('Y'));
- xy(vertexi+1,:)=[x,y]; % finally save them into the array
- end
- %find which ROI(s) it belongs to: at least one vertex must lie strictly inside the rectangle
- for roii = 1:num_roi
- if(any((xy(:,1)>Rois(roii).lxlyrxry(1)) & ...
- (xy(:,1)<Rois(roii).lxlyrxry(3)) & ...
- (xy(:,2)>Rois(roii).lxlyrxry(2)) & ...
- (xy(:,2)<Rois(roii).lxlyrxry(4))))
- %found
- field=sprintf('c%d',linecolor); %one field per line color, e.g. c255
- if(~isfield(Rois,field))
- Rois(roii).(field)={};
- end
- Rois(roii).(field){end+1}=xy;
- end
- end
- end
- end
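The membership test is simple enough to show on its own. Below is a minimal Python sketch of the same rule under stated assumptions: an annotation is attached to every ROI that strictly contains at least one of its vertices, and it is stored under a key built from its line color, as in the MATLAB code. The function names and the sample coordinates are made up for illustration.

```python
def has_vertex_inside(points, roi):
    """True if any (x, y) vertex lies strictly inside roi = [ulx, uly, lrx, lry]."""
    ulx, uly, lrx, lry = roi
    return any(ulx < x < lrx and uly < y < lry for x, y in points)


def assign_to_rois(rois, annotations):
    """Group annotations by ROI and by line color.

    rois:        list of [ulx, uly, lrx, lry]
    annotations: list of (line_color, [(x, y), ...]) for the non-ROI annotations
    Returns one dict per ROI, mapping 'c<line_color>' to a list of polygons.
    """
    assigned = [{} for _ in rois]
    for line_color, points in annotations:
        for index, roi in enumerate(rois):
            if has_vertex_inside(points, roi):
                assigned[index].setdefault(f"c{line_color}", []).append(points)
    return assigned


# Hypothetical data: two ROIs and three annotations, the last one outside both.
rois = [[100, 200, 400, 600], [1000, 1000, 1200, 1200]]
annotations = [
    (255, [(150, 250), (180, 250), (165, 300)]),
    (65280, [(1100, 1100), (1150, 1100), (1150, 1150)]),
    (255, [(0, 0), (5, 5), (5, 0)]),
]
for index, groups in enumerate(assign_to_rois(rois, annotations)):
    print(index, {key: len(polygons) for key, polygons in groups.items()})
```

Because the test only asks for one vertex inside the rectangle, an annotation that straddles an ROI border is still assigned to that ROI, and one that touches two ROIs is assigned to both.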
Finally, we iterate through all ROIs, extract each one from the base level of the BigTiff, and create a separate binary mask for each annotated color: in this case, blue for lymphocytes, green for stroma and red for tumor.
- % for all ROI, extract image, save, subtract corner from all points, make a
- % single mask of each color
- color_fields=fieldnames(Rois); %all field names of the ROI struct array
- color_fields(~cellfun(@(x)x(1)=='c',color_fields))=[];
- for roii= 1: length(Rois)
- Rows=[Rois(roii).lxlyrxry(2) Rois(roii).lxlyrxry(4)];
- Cols=[Rois(roii).lxlyrxry(1) Rois(roii).lxlyrxry(3)];
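- %page 3 is read here as the base (full resolution) level; adjust 'Index' if your file is laid out differently
- io=imread(bigtiff_file,'Index',3,'PixelRegion',{Rows,Cols});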
- [nrow,ncol,ndim]=size(io);
- imwrite(io,sprintf('subs/%s_%d_%d.tif',bigtiff_file(1:end-4),...
- Rois(roii).lxlyrxry(2),Rois(roii).lxlyrxry(1)));
- for colors=1:length(color_fields)
- annotations=Rois(roii).(color_fields{colors});
- if(isempty(annotations))
- continue
- end
- mask=zeros(nrow,ncol);
- for ai = 1: length(annotations)
- %make a mask and add it to the current mask
- mask=mask+poly2mask(annotations{ai}(:,1)-Rois(roii).lxlyrxry(1),...
- annotations{ai}(:,2)-Rois(roii).lxlyrxry(2),nrow,ncol);
- end
- imwrite(mask,sprintf('subs/%s_%d_%d_%s.png',bigtiff_file(1:end-4), ...
- Rois(roii).lxlyrxry(2),Rois(roii).lxlyrxry(1),color_fields{colors}));
- end
- end
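The mask step comes down to two operations: shift each polygon by the ROI's upper left corner, then fill it. The pure Python sketch below illustrates the idea under stated assumptions. It marks a pixel when its centre, taken at integer (col, row) coordinates, falls inside the polygon by the even-odd rule, which is not guaranteed to match poly2mask pixel for pixel at polygon edges. The function names are hypothetical, and for real slide-sized ROIs a vectorized polygon fill from an imaging library would be far faster.

```python
def polygon_to_mask(xs, ys, nrow, ncol):
    """Rasterize one polygon into an nrow x ncol list-of-lists mask of 0/1."""
    mask = [[0] * ncol for _ in range(nrow)]
    count = len(xs)
    for row in range(nrow):
        for col in range(ncol):
            inside = False
            j = count - 1
            for i in range(count):
                # Does the edge (j -> i) cross the horizontal line through this pixel?
                if (ys[i] > row) != (ys[j] > row):
                    x_cross = (xs[j] - xs[i]) * (row - ys[i]) / (ys[j] - ys[i]) + xs[i]
                    if col < x_cross:
                        inside = not inside  # even-odd rule: toggle per crossing
                j = i
            mask[row][col] = 1 if inside else 0
    return mask


def roi_mask(polygons, roi, nrow, ncol):
    """Combine all polygons of one color into a single binary mask for one ROI."""
    ulx, uly = roi[0], roi[1]
    combined = [[0] * ncol for _ in range(nrow)]
    for points in polygons:
        # Subtract the ROI corner so coordinates are relative to the cropped image.
        xs = [x - ulx for x, _ in points]
        ys = [y - uly for _, y in points]
        single = polygon_to_mask(xs, ys, nrow, ncol)
        for row in range(nrow):
            for col in range(ncol):
                combined[row][col] |= single[row][col]  # OR keeps the mask binary
    return combined


# Hypothetical example: one square annotation inside an ROI whose corner is (100, 200).
square = [(101.5, 200.5), (104.5, 200.5), (104.5, 203.5), (101.5, 203.5)]
for line in roi_mask([square], [100, 200, 106, 205], nrow=5, ncol=6):
    print(line)
```

Using a logical OR rather than a sum means overlapping annotations of the same color still produce a mask that contains only zeros and ones.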
In the end, we’ve created three different masks, one per annotated color, which can then be used further down the pipeline.
Is there a way to do the same using Octave or Python rather than Matlab, or any other way?
Thanks,
Sid
there is really nothing matlab “specific” about this code or approach, in the sense that it could easily be re-implemented using available open source python libraries, but i don’t believe i have that code implemented at the moment. although these days i’m using almost entirely python workflows, this is one of the components that we keep matlab around for. since the code is already built and debugged and only needs to be used once per dataset, it hasn’t warranted the effort of porting it over yet. if you manage to do it, i’d be very interested in the result!
Sure mate, will try to implement a python version and share the results with you.
Thanks
Actually, there is some code floating around in one of my projects which will get you really really close: https://github.com/choosehappy/HistoQC/pull/119