Skip to content

Conversation

quentinblampey
Copy link
Collaborator

Adding a function remove_overlap to remove overlaps between polygons. Experimental feature.

@quentinblampey quentinblampey merged commit c486145 into main Jun 2, 2025
7 checks passed
@NadineBestard
Copy link

Hello,
I tested this today, and cloning the main branch fails at 'from sopa.segmentation.shapes import expand_radius'. I noticed https://github.com/gustaveroussy/sopa/blob/main/sopa/segmentation/shapes.py is not available, however it is in no_overlap branch. Therefore, I tested from the no_overlap branch, and the expansion seemed to work, but then it failed to aggregate.

Here is the code I used:

# https://github.com/gustaveroussy/sopa/issues/241 
from sopa.segmentation.shapes import expand_radius

# save cell and nuclei shapes layers
# the nucs are the same as the stardist boundaries
sdata['nuclei_boundaries'] = sdata['stardist_boundaries'].copy() 
# create cell boundaries with expansion ( 2 -> r*2)
sdata['cell_boundaries'] = expand_radius(sdata['nuclei_boundaries'], 2, no_overlap=True) # no overlap is still experimental
Click to expand output - this worked
/home/ubuntu/anaconda3/envs/sopa.V2.0.7/lib/python3.11/site-packages/spatialdata/_core/_elements.py:105: UserWarning: Key `nuclei_boundaries` already exists. Overwriting it in-memory.
  self._check_key(key, self.keys(), self._shared_keys)
[WARNING] (sopa.segmentation.shapes) Computing Voronoi polygons to ensure no overlap between shapes is still experimental.
[WARNING] (sopa.segmentation.shapes) Computing Voronoi polygons to ensure no overlap between shapes is still experimental.
/home/ubuntu/CRM_SingleCell/Spatial/VisiumHD/src/src/sopa/sopa/segmentation/shapes.py:218: UserWarning: The indices of the left and right GeoSeries' are not equal, and therefore they will be aligned (reordering and/or introducing missing values) before executing the operation. If this alignment is the desired behaviour, you can silence this warning by passing 'align=True'. If you don't want alignment and protect yourself of accidentally aligning, you can pass 'align=False'.
  geometry = geo_df.intersection(_voronoi)
/home/ubuntu/anaconda3/envs/sopa.V2.0.7/lib/python3.11/site-packages/spatialdata/_core/_elements.py:105: UserWarning: Key `cell_boundaries` already exists. Overwriting it in-memory.
  self._check_key(key, self.keys(), self._shared_keys)
# aggregate for both shape layers
sopa.aggregate(sdata, aggregate_channels=False, expand_radius_ratio=0, image_key="cf_hires_image", shapes_key="nuclei_boundaries", key_added = "table_nuclei")
sopa.aggregate(sdata, aggregate_channels=False, expand_radius_ratio=0, image_key="cf_hires_image", shapes_key="cell_boundaries", key_added = "table_cell") 
Click to expand output - this did not work
/home/ubuntu/anaconda3/envs/sopa.V2.0.7/lib/python3.11/site-packages/anndata/_core/aligned_df.py:68: ImplicitModificationWarning: Transforming to str index.
  warnings.warn("Transforming to str index.", ImplicitModificationWarning)
/home/ubuntu/anaconda3/envs/sopa.V2.0.7/lib/python3.11/site-packages/spatialdata/_core/_elements.py:105: UserWarning: Key `nuclei_boundaries` already exists. Overwriting it in-memory.
  self._check_key(key, self.keys(), self._shared_keys)
/home/ubuntu/anaconda3/envs/sopa.V2.0.7/lib/python3.11/site-packages/anndata/_core/aligned_df.py:68: ImplicitModificationWarning: Transforming to str index.
  warnings.warn("Transforming to str index.", ImplicitModificationWarning)
/home/ubuntu/anaconda3/envs/sopa.V2.0.7/lib/python3.11/site-packages/spatialdata/_core/_elements.py:105: UserWarning: Key `cell_boundaries` already exists. Overwriting it in-memory.
  self._check_key(key, self.keys(), self._shared_keys)
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[21], line 3
      1 # aggregate for both shapes layers
      2 sopa.aggregate(sdata, aggregate_channels=False, expand_radius_ratio=0, image_key="cf_hires_image", shapes_key="nuclei_boundaries", key_added = "table_nuclei")
----> 3 sopa.aggregate(sdata, aggregate_channels=False, expand_radius_ratio=0, image_key="cf_hires_image", shapes_key="cell_boundaries", key_added = "table_cell") 

File ~/CRM_SingleCell/Spatial/VisiumHD/src/src/sopa/sopa/aggregation/aggregation.py:79, in aggregate(sdata, aggregate_genes, aggregate_channels, image_key, points_key, gene_column, shapes_key, bins_key, expand_radius_ratio, min_transcripts, min_intensity_ratio, no_overlap, key_added)
     76     key_added = f"{aggr.shapes_key}_{SopaKeys.TABLE}"
     77     log.info(f"key_added is None, saving the table as '{key_added}' by default.")
---> 79 aggr.compute_table(
     80     aggregate_genes=aggregate_genes,
     81     aggregate_channels=aggregate_channels,
     82     expand_radius_ratio=expand_radius_ratio,
     83     min_transcripts=min_transcripts,
     84     min_intensity_ratio=min_intensity_ratio,
     85     gene_column=gene_column,
     86     no_overlap=no_overlap,
     87     key_added=key_added,
     88 )

File ~/CRM_SingleCell/Spatial/VisiumHD/src/src/sopa/sopa/aggregation/aggregation.py:221, in compute_table(self, aggregate_genes, aggregate_channels, gene_column, expand_radius_ratio, min_transcripts, min_intensity_ratio, no_overlap, key_added)
    210         self.table = AnnData(
    211             mean_intensities,
    212             var=pd.DataFrame(index=self.image.coords["c"].values.astype(str)),
    213             obs=pd.DataFrame(index=self.geo_df.index),
    214         )
    216 self.sdata.shapes[self.shapes_key] = self.geo_df
    218 self.table.uns[SopaKeys.UNS_KEY] = {
    219     SopaKeys.UNS_HAS_TRANSCRIPTS: aggregate_genes,
    220     SopaKeys.UNS_HAS_INTENSITIES: aggregate_channels,
--> 221 }
    223 self.add_standardized_table(key_added)

File ~/CRM_SingleCell/Spatial/VisiumHD/src/src/sopa/sopa/aggregation/aggregation.py:224, in add_standardized_table(self, key_added)
      0 <Error retrieving source code with stack_data see ipython/ipython#13598>

File ~/CRM_SingleCell/Spatial/VisiumHD/src/src/sopa/sopa/aggregation/aggregation.py:242, in add_standardized_table(sdata, table, geo_df, shapes_key, table_key, image_key)
    237     self.points_key = points_key
    239 if aggregate_genes is None:
    240     aggregate_genes = (
    241         (self.table is not None and isinstance(self.table.X, csr_matrix))
--> 242         or gene_column is not None
    243         or self.bins_key is not None
    244         or self.points_key is not None
    245     )
    247 if average_intensities is not None:
    248     log.warning(
    249         "`average_intensities` is deprecated and will be removed in sopa==2.1.0, use `aggregate_channels` instead"
    250     )

File ~/CRM_SingleCell/Spatial/VisiumHD/src/src/sopa/sopa/aggregation/aggregation.py:242, in <listcomp>(.0)
    237     self.points_key = points_key
    239 if aggregate_genes is None:
    240     aggregate_genes = (
    241         (self.table is not None and isinstance(self.table.X, csr_matrix))
--> 242         or gene_column is not None
    243         or self.bins_key is not None
    244         or self.points_key is not None
    245     )
    247 if average_intensities is not None:
    248     log.warning(
    249         "`average_intensities` is deprecated and will be removed in sopa==2.1.0, use `aggregate_channels` instead"
    250     )

AttributeError: 'NoneType' object has no attribute 'x' 

@quentinblampey
Copy link
Collaborator Author

Hi @NadineBestard, thanks for trying this. Actually, in the main branch, I moved this function to sopa.shapes.expand_radius for convenience. Sorry for the change, once I release the version 2.1.0, it will not change anymore!

Your error seems weird, as I don't see a lowercase x attribute at the line mentioned. Is this the full log?
Can you try again on the main branch? I recently pushed a fix for sparse matrices, but it doesn't seem related to the error you have...

@quentinblampey
Copy link
Collaborator Author

Also, note that there are two possibilities:

  1. What you did, i.e. expanding the cells without overlap, and then aggregating the bins
  2. Expanding the cells during the aggregation, and then ensure that each bin is mapped to only one cell, based on gene expression profile. This is different from option 1 because it uses the bins expression and not just the shapes.

If you want to try option 2, then you need to provide no_overlap to sopa.aggregate directly:

sopa.aggregate(sdata, aggregate_channels=False, expand_radius_ratio=2, image_key="cf_hires_image", shapes_key="nuclei_boundaries", key_added = "table_nuclei", no_overlap=True)

I think that it's a little bit confusing to have these two options. Maybe I should rename use a different variable name in sopa.aggregate, like unique_mapping: bool, what do you think?
Your feedback is precious, I need it to make sure the version 2.1.0 will be stable and easy to understand :)

@NadineBestard
Copy link

Thank you! It's so nice you are already implementing to look at the GEX when assigning the bins to different cells :)
I noticed there was the no_overlap option into sopa.aggregate, but indeed I did not notice it was not doing exactly the same thing.
I think most people would use the aggregate directly, as you intend, as that is what is shown in the tutorial and is the logical workflow. And no_overlap is a good argument name.
I followed the workflow 'creating shapes -> aggregating the bins in those shapes' as this is more or less what you suggested in #241 (comment) when I was wondering how to add the expanded cell shape.
If using your proposed option, with expand_radius_ratio and no_overlap passed directly into the aggregation, then how would you add the resulting shapes into the data? If no_overlap from expand_radius is different from the one in aggregate the trick used in #241 (comment) would not work.

I'll test with the main branch on Monday or so!

@NadineBestard
Copy link

NadineBestard commented Jun 9, 2025

Still the same error with the main branch. Here is the complete log

Click to expand log
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[10], line 3
      1 # aggregate for both shape layers
      2 sopa.aggregate(sdata, aggregate_channels=False, expand_radius_ratio=0, image_key="cf_hires_image", shapes_key="nuclei_boundaries", key_added = "table_nuclei")
----> 3 sopa.aggregate(sdata, aggregate_channels=False, expand_radius_ratio=0, image_key="cf_hires_image", shapes_key="cell_boundaries", key_added = "table_cell") 

File ~/CRM_SingleCell/Spatial/VisiumHD/src/sopa/sopa/aggregation/aggregation.py:79, in aggregate(sdata, aggregate_genes, aggregate_channels, image_key, points_key, gene_column, shapes_key, bins_key, expand_radius_ratio, min_transcripts, min_intensity_ratio, no_overlap, key_added)
     76     key_added = f"{aggr.shapes_key}_{SopaKeys.TABLE}"
     77     log.info(f"key_added is None, saving the table as '{key_added}' by default.")
---> 79 aggr.compute_table(
     80     aggregate_genes=aggregate_genes,
     81     aggregate_channels=aggregate_channels,
     82     expand_radius_ratio=expand_radius_ratio,
     83     min_transcripts=min_transcripts,
     84     min_intensity_ratio=min_intensity_ratio,
     85     gene_column=gene_column,
     86     no_overlap=no_overlap,
     87     key_added=key_added,
     88 )

File ~/CRM_SingleCell/Spatial/VisiumHD/src/sopa/sopa/aggregation/aggregation.py:221, in Aggregator.compute_table(self, aggregate_genes, aggregate_channels, gene_column, expand_radius_ratio, min_transcripts, min_intensity_ratio, no_overlap, key_added)
    214 self.sdata.shapes[self.shapes_key] = self.geo_df
    216 self.table.uns[SopaKeys.UNS_KEY] = {
    217     SopaKeys.UNS_HAS_TRANSCRIPTS: aggregate_genes,
    218     SopaKeys.UNS_HAS_INTENSITIES: aggregate_channels,
    219 }
--> 221 self.add_standardized_table(key_added)

File ~/CRM_SingleCell/Spatial/VisiumHD/src/sopa/sopa/aggregation/aggregation.py:224, in Aggregator.add_standardized_table(self, key_added)
    223 def add_standardized_table(self, key_added: str):
--> 224     add_standardized_table(
    225         self.sdata, self.table, self.geo_df, self.shapes_key, key_added, image_key=self.image_key
    226     )

File ~/CRM_SingleCell/Spatial/VisiumHD/src/sopa/sopa/aggregation/aggregation.py:242, in add_standardized_table(sdata, table, geo_df, shapes_key, table_key, image_key)
    238 geo_df.index = list(table.obs_names)
    240 add_spatial_element(sdata, shapes_key, geo_df)
--> 242 table.obsm["spatial"] = np.array([[centroid.x, centroid.y] for centroid in geo_df.centroid])
    243 table.obs[SopaKeys.REGION_KEY] = pd.Series(shapes_key, index=table.obs_names, dtype="category")
    244 table.obs[SopaKeys.SLIDE_KEY] = pd.Series(image_key or "None", index=table.obs_names, dtype="category")

File ~/CRM_SingleCell/Spatial/VisiumHD/src/sopa/sopa/aggregation/aggregation.py:242, in <listcomp>(.0)
    238 geo_df.index = list(table.obs_names)
    240 add_spatial_element(sdata, shapes_key, geo_df)
--> 242 table.obsm["spatial"] = np.array([[centroid.x, centroid.y] for centroid in geo_df.centroid])
    243 table.obs[SopaKeys.REGION_KEY] = pd.Series(shapes_key, index=table.obs_names, dtype="category")
    244 table.obs[SopaKeys.SLIDE_KEY] = pd.Series(image_key or "None", index=table.obs_names, dtype="category")

AttributeError: 'NoneType' object has no attribute 'x'

I tested (option2) and that works though! So it would just be a matter of sorting how to add the cell shapes back to the object, in order to visualise if the expansion looks reasonable.

@quentinblampey
Copy link
Collaborator Author

Great, now the error makes more sense. Maybe there is an invalid geometry inside sdata['cell_boundaries']. Can you run the following lines for me? It will help me debugging.

print(len(sdata["cell_boundaries"]))
print(sdata["cell_boundaries"].geometry.map(lambda x: x.__class__.__name__).value_counts())
print(sdata["cell_boundaries"].centroid.map(lambda x: x.__class__.__name__).value_counts())

how would you add the resulting shapes into the data?

The problem with this approach is that the bins associated to one cell are not necessarily always next to each other. In other words, the cell would have a very weird shape, with multiple small holes in between...

@NadineBestard
Copy link

Then, do you think the expanded nuclei shapes with expand_radius would be the closest easily representable shapes, and is a good alternative for visualising how the expansion went?

I paste below the output from your code. Do not worry much about it though, it could be nice to troubleshoot, in case it is an underlying problem that could cause other errors. But for this particular case, Option 2, the most optimal one to expand nucs, seems to work fine!

print(len(sdata["cell_boundaries"])) 
# 11250

print(sdata["cell_boundaries"].geometry.map(lambda x: x.__class__.__name__).value_counts())
#geometry
#Polygon         11178
#NoneType           64
#MultiPolygon        8
#Name: count, dtype: int64

print(sdata["cell_boundaries"].centroid.map(lambda x: x.__class__.__name__).value_counts())
#Point       11186
#NoneType       64
#Name: count, dtype: int64

@quentinblampey
Copy link
Collaborator Author

Yes, indeed, using expand_radius is a relatively good approximation I believe, for visualization purposes.

Thanks for sharing this code: it seems that 64 shapes become None after expansion, which might happen in rare cases. It's good to know, I'll try to reproduce it and fix it!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants