
Commit 1054a4f

Merge pull request #43 from BloodAxe/develop
Pytorch-toolbelt 0.3.2
2 parents b8796b8 + 11693ca commit 1054a4f


71 files changed: +2467 / -1408 lines

.deepsource.toml

Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+version = 1
+
+test_patterns = [
+  "tests/**",
+  "test_*.py"
+]
+
+[[analyzers]]
+name = "python"
+enabled = true
+
+[analyzers.meta]
+runtime_version = "3.x.x"

README.md

Lines changed: 117 additions & 31 deletions
@@ -2,7 +2,7 @@

[![Build Status](https://travis-ci.org/BloodAxe/pytorch-toolbelt.svg?branch=develop)](https://travis-ci.org/BloodAxe/pytorch-toolbelt)
[![Documentation Status](https://readthedocs.org/projects/pytorch-toolbelt/badge/?version=latest)](https://pytorch-toolbelt.readthedocs.io/en/latest/?badge=latest)
-
+[![DeepSource](https://static.deepsource.io/deepsource-badge-light-mini.svg)](https://deepsource.io/gh/BloodAxe/pytorch-toolbelt/?ref=repository-badge)

A `pytorch-toolbelt` is a Python library with a set of bells and whistles for PyTorch for fast R&D prototyping and Kaggle farming:

@@ -25,57 +25,135 @@ During 2018 I achieved a [Kaggle Master](https://www.kaggle.com/bloodaxe) badge
Very often I found myself re-using most of the old pipelines over and over again.
At some point it crystallized into this repository.

-This lib is not meant to replace catalyst / ignite / fast.ai. Instead it's designed to complement them.
+This lib is not meant to replace the catalyst / ignite / fast.ai high-level frameworks. Instead, it's designed to complement them.

# Installation

`pip install pytorch_toolbelt`

-# Showcase
+# How do I ...
+
+## Model creation

-## Encoder-decoder models construction
+### Create Encoder-Decoder U-Net model

+Below is a code snippet that creates a vanilla U-Net model for binary segmentation.
+By design, both the encoder and the decoder produce a list of tensors, ordered from fine (high-resolution, indexed `0`) to coarse (low-resolution) feature maps.
+Access to all intermediate feature maps is beneficial if you want to apply deep supervision losses on them, or to build an encoder-decoder model for an object detection task, where intermediate feature maps are required.
+
```python
+from torch import nn
from pytorch_toolbelt.modules import encoders as E
from pytorch_toolbelt.modules import decoders as D

-class FPNSegmentationModel(nn.Module):
-    def __init__(self, encoder: E.EncoderModule, num_classes, fpn_features=128):
-        self.encoder = encoder
-        self.decoder = D.FPNDecoder(encoder.output_filters, fpn_features=fpn_features)
-        self.fuse = D.FPNFuse()
-        input_channels = sum(self.decoder.output_filters)
-        self.logits = nn.Conv2d(input_channels, num_classes, kernel_size=1)
-
-    def forward(self, input):
-        features = self.encoder(input)
-        features = self.decoder(features)
-        features = self.fuse(features)
-        logits = self.logits(features)
-        return logits
-
-def fpn_resnext50(num_classes):
-    encoder = E.SEResNeXt50Encoder()
-    return FPNSegmentationModel(encoder, num_classes)
-
-def fpn_mobilenet(num_classes):
-    encoder = E.MobilenetV2Encoder()
-    return FPNSegmentationModel(encoder, num_classes)
+class UNet(nn.Module):
+    def __init__(self, input_channels, num_classes):
+        super().__init__()
+        self.encoder = E.UnetEncoder(in_channels=input_channels, out_channels=32, growth_factor=2)
+        self.decoder = D.UNetDecoder(self.encoder.channels, decoder_features=32)
+        self.logits = nn.Conv2d(self.decoder.channels[0], num_classes, kernel_size=1)
+
+    def forward(self, x):
+        x = self.encoder(x)
+        x = self.decoder(x)
+        return self.logits(x[0])
```
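
For illustration, a minimal usage sketch of the model above, assuming a 3-channel input (shapes are illustrative):

```python
import torch

# Instantiate the U-Net above for RGB input and binary segmentation
model = UNet(input_channels=3, num_classes=1)

# Forward pass on a dummy batch; the output is a single-channel logit map
mask_logits = model(torch.rand(2, 3, 256, 256))
```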

-## Compose multiple losses
+### Create Encoder-Decoder FPN model with pretrained encoder
+
+Similarly to the previous example, you can change the decoder to an FPN with concatenation.
+
+```python
+from torch import nn
+from pytorch_toolbelt.modules import encoders as E
+from pytorch_toolbelt.modules import decoders as D
+
+class SEResNeXt50FPN(nn.Module):
+    def __init__(self, num_classes, fpn_channels):
+        super().__init__()
+        self.encoder = E.SEResNeXt50Encoder()
+        self.decoder = D.FPNCatDecoder(self.encoder.channels, fpn_channels)
+        self.logits = nn.Conv2d(self.decoder.channels[0], num_classes, kernel_size=1)
+
+    def forward(self, x):
+        x = self.encoder(x)
+        x = self.decoder(x)
+        return self.logits(x[0])
+```
+
+### Change number of input channels for the Encoder
+
+All encoders from `pytorch_toolbelt` support changing the number of input channels. Simply call `encoder.change_input_channels(num_channels)` and the first convolution layer will be changed.
+Whenever possible, the existing weights of the convolutional layer will be re-used (if the new number of channels is greater than the default, the new weight tensor will be padded with randomly-initialized weights).
+The method returns `self`, so the call can be chained.
+
+```python
+from pytorch_toolbelt.modules import encoders as E
+
+encoder = E.SEResnet101Encoder()
+encoder = encoder.change_input_channels(6)
+```
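
Since `change_input_channels` returns `self`, the snippet above can also be written as a single chained call (a minimal sketch):

```python
from pytorch_toolbelt.modules import encoders as E

# Construction and input-channel patching in one chained expression
encoder = E.SEResnet101Encoder().change_input_channels(6)
```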
+
+## Misc
+
+### Count number of parameters in encoder/decoder and other modules
+
+When designing a model and optimizing the number of features in a neural network, I found it quite useful to print the number of parameters in high-level blocks (like `encoder` and `decoder`).
+Here is how to do it with `pytorch_toolbelt`:
+
+```python
+from torch import nn
+from pytorch_toolbelt.modules import encoders as E
+from pytorch_toolbelt.modules import decoders as D
+from pytorch_toolbelt.utils import count_parameters
+
+class SEResNeXt50FPN(nn.Module):
+    def __init__(self, num_classes, fpn_channels):
+        super().__init__()
+        self.encoder = E.SEResNeXt50Encoder()
+        self.decoder = D.FPNCatDecoder(self.encoder.channels, fpn_channels)
+        self.logits = nn.Conv2d(self.decoder.channels[0], num_classes, kernel_size=1)
+
+    def forward(self, x):
+        x = self.encoder(x)
+        x = self.decoder(x)
+        return self.logits(x[0])
+
+net = SEResNeXt50FPN(1, 128)
+print(count_parameters(net))
+# Prints {'total': 34232561, 'trainable': 34232561, 'encoder': 25510896, 'decoder': 8721536, 'logits': 129}
+```
+
+### Compose multiple losses
+
+There are multiple ways to combine multiple losses, and high-level DL frameworks like Catalyst offer far more flexible ways to achieve this, but here is my 100%-pure PyTorch implementation:

```python
from pytorch_toolbelt import losses as L

+# Creates a loss function that is a weighted sum of focal loss
+# and lovasz loss with weights 1.0 and 0.5 accordingly.
loss = L.JointLoss(L.FocalLoss(), 1.0, L.LovaszLoss(), 0.5)
```

-## Test-time augmentation
+
+## TTA / Inferencing
+
+### Apply Test-time augmentation (TTA) for the model
+
+Test-time augmentation (TTA) can be used in both training and testing phases.

```python
from pytorch_toolbelt.inference import tta

+model = UNet(input_channels=3, num_classes=1)
+
# Truly functional TTA for image classification using horizontal flips:
logits = tta.fliplr_image2label(model, input)

@@ -87,11 +165,19 @@ tta_model = tta.TTAWrapper(model, tta.fivecrop_image2label, crop_size=512)
logits = tta_model(input)
```

-## Inference on huge images:
+### Inference on huge images
+
+Quite often, there is a need to perform image segmentation on an enormously large image (5000px and more). There are a few problems with such big pixel arrays:
+1. There are limits on the maximum size of CUDA tensors (concrete numbers depend on the driver and GPU version).
+2. Heavy CNN architectures may easily eat up all available GPU memory when inferencing relatively small 1024x1024 images, leaving no room for larger image resolutions.
+
+One of the solutions is to slice the input image into (optionally overlapping) tiles, feed each tile through the model, and merge the results back together.
+This way you can guarantee an upper limit on GPU RAM usage while keeping the ability to process arbitrary-sized images on the GPU.

```python
import numpy as np
-import torch
+from torch.utils.data import DataLoader
import cv2

from pytorch_toolbelt.inference.tiles import ImageSlicer, CudaTileMerger
@@ -102,7 +188,7 @@ image = cv2.imread('really_huge_image.jpg')
model = get_model(...)

# Cut large image into overlapping tiles
-tiler = ImageSlicer(image.shape, tile_size=(512, 512), tile_step=(256, 256), weight='pyramid')
+tiler = ImageSlicer(image.shape, tile_size=(512, 512), tile_step=(256, 256))

# HWC -> CHW. Optionally, do normalization here
tiles = [tensor_from_rgb_image(tile) for tile in tiler.split(image)]
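
The diff shows only the changed lines of this example. For context, a sketch of how such a tiled-inference pipeline typically continues, using the `DataLoader` and `CudaTileMerger` imported above; attribute and method names such as `tiler.crops`, `integrate_batch` and `merge` are assumptions to verify against this version of the library:

```python
# Accumulate per-tile predictions into a single full-size canvas on the GPU
merger = CudaTileMerger(tiler.target_shape, channels=1, weight=tiler.weight)

# Batch tiles together with their crop coordinates and run the model on each batch
for tiles_batch, coords_batch in DataLoader(list(zip(tiles, tiler.crops)), batch_size=8, pin_memory=True):
    pred_batch = model(tiles_batch.float().cuda())
    merger.integrate_batch(pred_batch, coords_batch)

# Normalize the accumulated predictions into the final full-resolution mask
merged_mask = merger.merge()
```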

pytorch_toolbelt/__init__.py

Lines changed: 1 addition & 1 deletion
@@ -1,3 +1,3 @@
from __future__ import absolute_import

-__version__ = "0.3.1"
+__version__ = "0.3.2"
Lines changed: 91 additions & 0 deletions
@@ -0,0 +1,91 @@
+from torch import nn, Tensor
+from typing import List, Union
+
+__all__ = ["ApplySoftmaxTo", "ApplySigmoidTo", "Ensembler", "PickModelOutput"]
+
+
+class ApplySoftmaxTo(nn.Module):
+    def __init__(self, model, output_key: Union[str, List[str]] = "logits", dim=1):
+        super().__init__()
+        output_key = output_key if isinstance(output_key, (list, tuple)) else [output_key]
+        # By converting to set, we prevent double-activation by passing output_key=["logits", "logits"]
+        self.output_keys = set(output_key)
+        self.model = model
+        self.dim = dim
+
+    def forward(self, input):
+        output = self.model(input)
+        for key in self.output_keys:
+            output[key] = output[key].softmax(dim=self.dim)
+        return output
+
+
+class ApplySigmoidTo(nn.Module):
+    def __init__(self, model, output_key: Union[str, List[str]] = "logits"):
+        super().__init__()
+        output_key = output_key if isinstance(output_key, (list, tuple)) else [output_key]
+        # By converting to set, we prevent double-activation by passing output_key=["logits", "logits"]
+        self.output_keys = set(output_key)
+        self.model = model
+
+    def forward(self, input):  # skipcq: PYL-W0221
+        output = self.model(input)
+        for key in self.output_keys:
+            output[key] = output[key].sigmoid()
+        return output
+
+
+class Ensembler(nn.Module):
+    """
+    Computes the sum of outputs from several models, with optional arithmetic averaging.
+    """
+
+    def __init__(self, models: List[nn.Module], average=True, outputs=None):
+        """
+
+        :param models: List of models to ensemble
+        :param average: If True, the summed outputs are averaged across models
+        :param outputs: Names of model outputs to average and return from Ensembler.
+            If None, all outputs from the first model will be used.
+        """
+        super().__init__()
+        self.outputs = outputs
+        self.models = nn.ModuleList(models)
+        self.average = average
+
+    def forward(self, x):  # skipcq: PYL-W0221
+        output_0 = self.models[0](x)
+        num_models = len(self.models)
+
+        if self.outputs:
+            keys = self.outputs
+        else:
+            keys = output_0.keys()
+
+        for index in range(1, num_models):
+            output_i = self.models[index](x)
+
+            # Sum outputs
+            for key in keys:
+                output_0[key] += output_i[key]
+
+        if self.average:
+            for key in keys:
+                output_0[key] /= num_models
+
+        return output_0
+
+
+class PickModelOutput(nn.Module):
+    """
+    Assuming you have a model that outputs a dictionary, this module returns only a given element by its key
+    """
+
+    def __init__(self, model: nn.Module, key: str):
+        super().__init__()
+        self.model = model
+        self.target_key = key
+
+    def forward(self, input) -> Tensor:
+        output = self.model(input)
+        return output[self.target_key]
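
A minimal usage sketch for the classes above (the dummy model is hypothetical, for illustration only):

```python
import torch
from torch import nn


class DummySegModel(nn.Module):
    """A toy model returning a dictionary output, as Ensembler expects."""

    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 1, kernel_size=1)

    def forward(self, x):
        return {"logits": self.conv(x)}


# Average the "logits" output over two models, then apply sigmoid to the averaged tensor
ensemble = ApplySigmoidTo(Ensembler([DummySegModel(), DummySegModel()]), output_key="logits")
probs = ensemble(torch.rand(2, 3, 64, 64))["logits"]

# Or unwrap the dictionary output and return a plain tensor
logits_only = PickModelOutput(Ensembler([DummySegModel(), DummySegModel()]), key="logits")
```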
