DarthReca committed (verified) · Commit dd7cdee · Parent: 4970bed

Upload 4 files

Files changed (4):
  1. NOTICE.md +30 -0
  2. README.md +68 -3
  3. convlstm.py +209 -0
  4. modeling_actu.py +332 -0
NOTICE.md ADDED
@@ -0,0 +1,30 @@
+ ## Third-Party Software
+
+ This project incorporates components from the following third-party software:
+
+ ### ConvLSTM (MIT License)
+
+ The ConvLSTM code is used in this project. Its original license is reproduced below:
+
+ ---
+ MIT License
+
+ Copyright (c) 2022 Seyong Kim
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
README.md CHANGED
@@ -1,3 +1,68 @@
- ---
- license: openrail
- ---
+ ---
+ license: openrail
+ datasets:
+ - DarthReca/hydro-chronos
+ pipeline_tag: image-segmentation
+ tags:
+ - climate
+ - geospatial
+ - remote-sensing
+ - spatiotemporal
+ - multi-modal
+ - earth-observation
+ - time-series
+ - hydrology
+ library_name: transformers
+ ---
+
+ # ACTU for Magnitude Regression
+
+ <!-- Provide a quick summary of what the model is/does. -->
+ This is ACTU for pixelwise regression of MNDWI (Modified Normalized Difference Water Index).
+
+ ## Model Details
+
+ <!-- Provide a longer summary of what this model is. -->
+ This architecture is a temporal UNet (built with ConvLSTMs), featuring an LSTM branch that processes climate time series and a gating mechanism that fuses the two streams.
+ It is designed to receive a time series of Sentinel-2 images, a DEM, and time series of climate variables, and to output a single real-valued mask of future MNDWI.
+
+ - **Developed by:** Daniele Rege Cambrin
+ - **Model type:** ACTU
+ - **License:** OpenRAIL
+ - **Repository:** [GitHub](https://github.com/DarthReca/hydro-chronos)
+ - **Paper:** [arXiv](https://arxiv.org/abs/2506.14362)
+
+
+ ## How to Get Started with the Model
+ The model is integrated into Transformers, so you can easily load it with the following code:
+
+ ```python
+ AutoModel.from_pretrained("DarthReca/actu-magnitude-regression", trust_remote_code=True, revision="<revision>")
+ ```
+
+ Select the desired configuration with the *revision* parameter (the branches of this repo). These configurations are available:
+
+ | Revision    | Backbone        | DEM | Climate |
+ |-------------|-----------------|:---:|:-------:|
+ | main        | ConvNeXtV2 Base | No  | No      |
+ | dem-climate | ConvNeXtV2 Base | Yes | Yes     |
+
50
+ ## Training Details
51
+ The model is pre-trained on Landsat-5 images and fine-tuned on Sentinel-2 of HydroChronos.
52
+
53
+ ## Citation
54
+
55
+ ```bibtex
56
+ @misc{cambrin2025hydrochronosforecastingdecadessurface,
57
+ title={HydroChronos: Forecasting Decades of Surface Water Change},
58
+ author={Daniele Rege Cambrin and Eleonora Poeta and Eliana Pastor and Isaac Corley and Tania Cerquitelli and Elena Baralis and Paolo Garza},
59
+ year={2025},
60
+ eprint={2506.14362},
61
+ archivePrefix={arXiv},
62
+ primaryClass={cs.CV},
63
+ url={https://arxiv.org/abs/2506.14362},
64
+ }
65
+ ```
66
+
67
+ ## Licensing
68
+ The project uses third-party software. For detailed information on the licensing of each component, please see the [**NOTICE.md**](NOTICE.md) file.
convlstm.py ADDED
@@ -0,0 +1,209 @@
+ # Copyright (c) 2022 Seyong Kim
+
+ from typing import Any, Optional, Tuple, Union
+
+ import torch
+ from torch import Tensor, nn, sigmoid, tanh
+
+
+ class ConvGate(nn.Module):
+     def __init__(
+         self,
+         in_channels: int,
+         hidden_channels: int,
+         kernel_size: Union[Tuple[int, int], int],
+         padding: Union[Tuple[int, int], int],
+         stride: Union[Tuple[int, int], int],
+         bias: bool,
+     ):
+         super().__init__()
+         self.conv_x = nn.Conv2d(
+             in_channels=in_channels,
+             out_channels=hidden_channels * 4,
+             kernel_size=kernel_size,
+             padding=padding,
+             stride=stride,
+             bias=bias,
+         )
+         self.conv_h = nn.Conv2d(
+             in_channels=hidden_channels,
+             out_channels=hidden_channels * 4,
+             kernel_size=kernel_size,
+             padding=padding,
+             stride=stride,
+             bias=bias,
+         )
+         self.bn2d = nn.BatchNorm2d(hidden_channels * 4)
+
+     def forward(self, x, hidden_state):
+         gated = self.conv_x(x) + self.conv_h(hidden_state)
+         return self.bn2d(gated)
+
+
+ class ConvLSTMCell(nn.Module):
+     def __init__(
+         self, in_channels, hidden_channels, kernel_size, padding, stride, bias
+     ):
+         super().__init__()
+         # To inspect the model structure with tools such as torchinfo, the
+         # custom module needs to be wrapped in an nn.ModuleList
+         self.gates = nn.ModuleList(
+             [ConvGate(in_channels, hidden_channels, kernel_size, padding, stride, bias)]
+         )
+
+     def forward(
+         self, x: Tensor, hidden_state: Tensor, cell_state: Tensor
+     ) -> Tuple[Tensor, Tensor]:
+         gated = self.gates[0](x, hidden_state)
+         i_gated, f_gated, c_gated, o_gated = gated.chunk(4, dim=1)
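+         # Standard ConvLSTM update (Shi et al., 2015): all four gate
+         # pre-activations come from one convolution (plus BatchNorm) and are
+         # split channel-wise above:
+         #   i_t = sigmoid(i_gated)                      input gate
+         #   f_t = sigmoid(f_gated)                      forget gate
+         #   o_t = sigmoid(o_gated)                      output gate
+         #   c_t = f_t * c_{t-1} + i_t * tanh(c_gated)   cell state
+         #   h_t = o_t * tanh(c_t)                       hidden state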
+
+         i_gated = sigmoid(i_gated)
+         f_gated = sigmoid(f_gated)
+         o_gated = sigmoid(o_gated)
+
+         cell_state = f_gated.mul(cell_state) + i_gated.mul(tanh(c_gated))
+         hidden_state = o_gated.mul(tanh(cell_state))
+
+         return hidden_state, cell_state
+
+
+ class ConvLSTM(nn.Module):
+     """ConvLSTM module"""
+
+     def __init__(
+         self,
+         in_channels,
+         hidden_channels,
+         kernel_size,
+         padding,
+         stride,
+         bias,
+         batch_first,
+         bidirectional,
+     ):
+         super().__init__()
+         self.in_channels = in_channels
+         self.hidden_channels = hidden_channels
+         self.bidirectional = bidirectional
+         self.batch_first = batch_first
+
+         # To inspect the model structure with tools such as torchinfo, the
+         # custom module needs to be wrapped in an nn.ModuleList
+         self.conv_lstm_cells = nn.ModuleList(
+             [
+                 ConvLSTMCell(
+                     in_channels, hidden_channels, kernel_size, padding, stride, bias
+                 )
+             ]
+         )
+
+         if self.bidirectional:
+             self.conv_lstm_cells.append(
+                 ConvLSTMCell(
+                     in_channels, hidden_channels, kernel_size, padding, stride, bias
+                 )
+             )
+
+         self.batch_size = None
+         self.seq_len = None
+         self.height = None
+         self.width = None
+
+     def forward(
+         self, x: Tensor, state: Optional[Tuple[Tensor, Tensor]] = None
+     ) -> Tuple[Tensor, Tuple[Tensor, Tensor]]:
+         # size of x: (B, T, C, H, W) or (T, B, C, H, W)
+         x = self._check_shape(x)
+         hidden_state, cell_state, backward_hidden_state, backward_cell_state = (
+             self.init_state(x, state)
+         )
+
+         output, hidden_state, cell_state = self._forward(
+             self.conv_lstm_cells[0], x, hidden_state, cell_state
+         )
+
+         if self.bidirectional:
+             x = torch.flip(x, [1])
+             backward_output, backward_hidden_state, backward_cell_state = self._forward(
+                 self.conv_lstm_cells[1], x, backward_hidden_state, backward_cell_state
+             )
+
+             output = torch.cat([output, backward_output], dim=-3)
+             hidden_state = torch.cat([hidden_state, backward_hidden_state], dim=-1)
+             cell_state = torch.cat([cell_state, backward_cell_state], dim=-1)
+         return output, (hidden_state, cell_state)
+
+     def _forward(self, lstm_cell, x, hidden_state, cell_state):
+         outputs = []
+         for time_step in range(self.seq_len):
+             x_t = x[:, time_step, :, :, :]
+             hidden_state, cell_state = lstm_cell(x_t, hidden_state, cell_state)
+             # Note: per-step outputs are detached, so gradients flow only
+             # through the returned final states
+             outputs.append(hidden_state.detach())
+         output = torch.stack(outputs, dim=1)
+         return output, hidden_state, cell_state
+
+     def _check_shape(self, x: Tensor) -> Tensor:
+         if self.batch_first:
+             batch_size, self.seq_len = x.shape[0], x.shape[1]
+         else:
+             batch_size, self.seq_len = x.shape[1], x.shape[0]
+             x = torch.swapaxes(x, 0, 1)  # (T, B, ...) -> (B, T, ...)
+
+         self.height = x.shape[-2]
+         self.width = x.shape[-1]
+
+         dim = len(x.shape)
+
+         if dim == 4:
+             x = x.unsqueeze(dim=1)  # add the missing channel dimension
+             x = x.view(batch_size, self.seq_len, -1, self.height, self.width)
+             x = x.contiguous()  # reassign memory location
+         elif dim <= 3:
+             raise ValueError(
+                 f"Got {len(x.shape)} dimensional tensor. Input shape unmatched"
+             )
+
+         return x
+
+     def init_state(
+         self, x: Tensor, state: Optional[Tuple[Tensor, Tensor]]
+     ) -> Tuple[Union[Tensor, Any], Union[Tensor, Any], Optional[Any], Optional[Any]]:
+         # If no state is given as input, initialize the state to zeros
+         backward_hidden_state, backward_cell_state = None, None
+
+         if state is None:
+             self.batch_size = x.shape[0]
+             hidden_state, cell_state = self._init_state(x.dtype, x.device)
+
+             if self.bidirectional:
+                 backward_hidden_state, backward_cell_state = self._init_state(
+                     x.dtype, x.device
+                 )
+         else:
+             if self.bidirectional:
+                 hidden_state, backward_hidden_state = state[0].chunk(2, dim=-1)
+                 cell_state, backward_cell_state = state[1].chunk(2, dim=-1)
+             else:
+                 hidden_state, cell_state = state
+
+         return hidden_state, cell_state, backward_hidden_state, backward_cell_state
+
+     def _init_state(self, dtype, device):
+         self.register_buffer(
+             "hidden_state",
+             torch.zeros(
+                 (1, self.hidden_channels, self.height, self.width),
+                 dtype=dtype,
+                 device=device,
+             ),
+         )
+         self.register_buffer(
+             "cell_state",
+             torch.zeros(
+                 (1, self.hidden_channels, self.height, self.width),
+                 dtype=dtype,
+                 device=device,
+             ),
+         )
+         return self.hidden_state, self.cell_state
modeling_actu.py ADDED
@@ -0,0 +1,332 @@
+ from dataclasses import dataclass
+
+ import numpy as np
+ import timm
+ import torch
+ import torch.nn as nn
+ import torch.nn.functional as F
+ from einops import rearrange, repeat
+ from segmentation_models_pytorch.base import SegmentationHead
+ from segmentation_models_pytorch.decoders.unet.decoder import UnetDecoder
+ from timm.layers.create_act import create_act_layer
+ from transformers import PretrainedConfig, PreTrainedModel
+ from transformers.modeling_outputs import SemanticSegmenterOutput
+
+ from .convlstm import ConvLSTM
+
+
+ class ACTUConfig(PretrainedConfig):
+     model_type = "actu"
+
+     def __init__(
+         self,
+         # Base ACTU parameters
+         in_channels: int = 3,
+         kernel_size: tuple[int, int] = (3, 3),
+         padding="same",
+         stride=(1, 1),
+         backbone="resnet34",
+         bias=True,
+         batch_first=True,
+         bidirectional=False,
+         original_resolution=(256, 256),
+         act_layer="sigmoid",
+         n_classes=1,
+         # Variant control parameters
+         use_dem_input: bool = False,
+         use_climate_branch: bool = False,
+         # Climate branch parameters
+         climate_seq_len=5,
+         climate_input_dim=6,
+         lstm_hidden_dim=128,
+         num_lstm_layers=1,
+         **kwargs,
+     ):
+         super().__init__(**kwargs)
+         self.in_channels = in_channels
+         self.kernel_size = kernel_size
+         self.padding = padding
+         self.stride = stride
+         self.backbone = backbone
+         self.bias = bias
+         self.batch_first = batch_first
+         self.bidirectional = bidirectional
+         self.original_resolution = original_resolution
+         self.act_layer = act_layer
+         self.n_classes = n_classes
+
+         # Parameters to control variants
+         self.use_dem_input = use_dem_input
+         self.use_climate_branch = use_climate_branch
+         self.climate_seq_len = climate_seq_len
+         self.climate_input_dim = climate_input_dim
+         self.lstm_hidden_dim = lstm_hidden_dim
+         self.num_lstm_layers = num_lstm_layers
+
+         # Adjust in_channels if DEM is used
+         if self.use_dem_input:
+             self.in_channels += 1
+
70
+
71
+ class ACTUForImageSegmentation(PreTrainedModel):
72
+ config_class = ACTUConfig
73
+
74
+ def __init__(self, config: ACTUConfig):
75
+ super().__init__(config)
76
+ self.config = config
77
+
78
+ self.encoder: nn.Module = timm.create_model(
79
+ config.backbone, features_only=True, in_chans=config.in_channels
80
+ )
81
+
82
+ with torch.no_grad():
83
+ dummy_input_channels = config.in_channels
84
+ dummy_input = torch.randn(
85
+ 1, dummy_input_channels, *config.original_resolution, device=self.device
86
+ )
87
+ embs = self.encoder(dummy_input)
88
+ self.embs_shape = [e.shape for e in embs]
89
+ self.encoder_channels = [e[1] for e in self.embs_shape]
90
+
91
+ self.convlstm = nn.ModuleList(
92
+ [
93
+ ConvLSTM(
94
+ in_channels=shape[1],
95
+ hidden_channels=shape[1],
96
+ kernel_size=config.kernel_size,
97
+ padding=config.padding,
98
+ stride=config.stride,
99
+ bias=config.bias,
100
+ batch_first=config.batch_first,
101
+ bidirectional=config.bidirectional,
102
+ )
103
+ for shape in self.embs_shape
104
+ ]
105
+ )
106
+
107
+ if self.config.use_climate_branch:
108
+ self.climate_branch = ClimateBranchLSTM(
109
+ output_shapes=[e[1:] for e in self.embs_shape],
110
+ lstm_hidden_dim=config.lstm_hidden_dim,
111
+ climate_seq_len=config.climate_seq_len,
112
+ climate_input_dim=config.climate_input_dim,
113
+ num_lstm_layers=config.num_lstm_layers,
114
+ )
115
+ self.fusers = nn.ModuleList(
116
+ GatedFusion(enc, enc) for enc in self.encoder_channels
117
+ )
118
+
119
+ self.decoder = UnetDecoder(
120
+ encoder_channels=[1] + self.encoder_channels,
121
+ decoder_channels=self.encoder_channels[::-1],
122
+ n_blocks=len(self.encoder_channels),
123
+ )
124
+
125
+ self.seg_head = nn.Sequential(
126
+ SegmentationHead(
127
+ in_channels=self.encoder_channels[0],
128
+ out_channels=config.n_classes,
129
+ ),
130
+ create_act_layer(config.act_layer, inplace=True),
131
+ )
132
+
133
+ def forward(
134
+ self,
135
+ pixel_values: torch.Tensor,
136
+ climate: torch.Tensor = None,
137
+ dem: torch.Tensor = None,
138
+ labels: torch.Tensor = None,
139
+ **kwargs,
140
+ ) -> SemanticSegmenterOutput:
141
+ b, t = pixel_values.shape[:2]
142
+ original_size = pixel_values.shape[-2:]
143
+
144
+ # Handle DEM input
145
+ if self.config.use_dem_input:
146
+ if dem is None:
147
+ raise ValueError(
148
+ "DEM tensor must be provided when use_dem_input is True."
149
+ )
150
+ dem_repeated = repeat(dem, "b c h w -> b t c h w", t=t)
151
+ pixel_values = torch.cat([pixel_values, dem_repeated], dim=2)
152
+
153
+ # 1. Encode images per time step
154
+ encoded_sequence = self._encode_images(pixel_values)
155
+
156
+ # 2. Handle Climate Branch Fusion
157
+ if self.config.use_climate_branch:
158
+ if climate is None:
159
+ raise ValueError(
160
+ "Climate tensor must be provided when use_climate_branch is True."
161
+ )
162
+
163
+ climate_features = self.climate_branch(climate)
164
+
165
+ # Reshape for fusion
166
+ encoded_sequence_reshaped = [
167
+ rearrange(f, "b t c h w -> (b t) c h w") for f in encoded_sequence
168
+ ]
169
+ climate_features_reshaped = [
170
+ rearrange(f, "b t c h w -> (b t) c h w") for f in climate_features
171
+ ]
172
+
173
+ # Fuse features
174
+ fused_features = [
175
+ fuser(img, clim)
176
+ for fuser, img, clim in zip(
177
+ self.fusers, encoded_sequence_reshaped, climate_features_reshaped
178
+ )
179
+ ]
180
+
181
+ # Reshape back to sequence
182
+ encoded_sequence = [
183
+ rearrange(f, "(b t) c h w -> b t c h w", b=b) for f in fused_features
184
+ ]
185
+
186
+ # 3. Process sequence with ConvLSTM
187
+ temporal_features = self._encode_timeseries(encoded_sequence)
188
+
189
+ # 4. Decode to get the segmentation map
190
+ logits = self._decode(temporal_features, size=original_size)
191
+
192
+ loss = None
193
+ if labels is not None:
194
+ loss_fct = nn.CrossEntropyLoss()
195
+ loss = loss_fct(logits, labels.float().unsqueeze(1))
196
+
197
+ return SemanticSegmenterOutput(
198
+ loss=loss,
199
+ logits=logits,
200
+ )
201
+
202
+ def _encode_images(self, x: torch.Tensor) -> list[torch.Tensor]:
203
+ B = x.size(0)
204
+ encoded_frames = self.encoder(rearrange(x, "b t c h w -> (b t) c h w"))
205
+ return [
206
+ rearrange(frames, "(b t) c h w -> b t c h w", b=B)
207
+ for frames in encoded_frames
208
+ ]
209
+
210
+ def _encode_timeseries(self, timeseries: torch.Tensor) -> list[torch.Tensor]:
211
+ outs = []
212
+ for convlstm, encoded in reversed(list(zip(self.convlstm, timeseries))):
213
+ lstm_out, (_, _) = convlstm(encoded)
214
+ outs.append(lstm_out[:, -1, :, :, :])
215
+ return outs
216
+
217
+ def _decode(self, x: torch.Tensor, size: tuple[int, int]) -> torch.Tensor:
218
+ trend_map = self.decoder(*[None] + x[::-1])
219
+ trend_map = self.seg_head(trend_map)
220
+ trend_map = F.interpolate(
221
+ trend_map, size=size, mode="bilinear", align_corners=False
222
+ )
223
+ return trend_map
224
+
225
+
226
+ class ClimateBranchLSTM(nn.Module):
227
+ """
228
+ Processes climate time series data using an LSTM.
229
+ Input shape: (B, T, T_1, C_clim) -> e.g., (B, 5, 6, 5)
230
+ Output shape: (B, T, output_dim) -> e.g., (B, 5, 128)
231
+ """
232
+
233
+ def __init__(
234
+ self,
235
+ output_shapes: list[tuple[int, int, int]],
236
+ climate_input_dim=5,
237
+ climate_seq_len=6,
238
+ lstm_hidden_dim=64,
239
+ num_lstm_layers=1,
240
+ ):
241
+ super().__init__()
242
+ self.climate_seq_len = climate_seq_len
243
+ self.climate_input_dim = climate_input_dim
244
+ self.lstm_hidden_dim = lstm_hidden_dim
245
+ self.num_lstm_layers = num_lstm_layers
246
+ self.proj_dim = 128
247
+ self.output_shapes = output_shapes
248
+
249
+ self.lstm = nn.LSTM(
250
+ input_size=climate_input_dim,
251
+ hidden_size=lstm_hidden_dim,
252
+ num_layers=num_lstm_layers,
253
+ batch_first=True, # Crucial: expects input shape (batch, seq_len, features)
254
+ dropout=0.3 if num_lstm_layers > 1 else 0,
255
+ bidirectional=False,
256
+ )
257
+
258
+ # Linear layer to project LSTM output to the desired final dimension
259
+ self.fc = nn.Linear(lstm_hidden_dim, self.proj_dim)
260
+
261
+ self.upsamples = nn.ModuleList(
262
+ _build_upsampler(self.proj_dim, *shape[:2]) for shape in output_shapes
263
+ )
264
+
265
+ def forward(self, climate_data: torch.Tensor) -> list[torch.Tensor]:
266
+ # climate_data shape: (B, T, T_1, C_clim), e.g., (B, 5, 6, 5)
267
+ B_img, B_cli, T, C = climate_data.shape
268
+
269
+ # Reshape for LSTM: Treat each sequence independently
270
+ lstm_input = rearrange(climate_data, "Bi Bc T C -> (Bi Bc) T C")
271
+
272
+ # Pass through LSTM
273
+ _, (hidden, _) = self.lstm.forward(lstm_input)
274
+ # Get the last layer's hidden state
275
+ last_hidden = (
276
+ hidden[[hidden.size(0) // 2, -1]] if self.lstm.bidirectional else hidden[-1]
277
+ )
278
+ if last_hidden.ndim == 3:
279
+ last_hidden = hidden.mean(dim=0)
280
+
281
+ # Pass the final hidden state through the fully connected layer(s) and upsample
282
+ climate_features = self.fc(last_hidden)
283
+ climate_features = rearrange(climate_features, "b c -> b c 1 1")
284
+ climate_features = [
285
+ rearrange(
286
+ u(climate_features), "(Bi Bc) C H W -> Bi Bc C H W", Bi=B_img, Bc=B_cli
287
+ )
288
+ for u in self.upsamples
289
+ ]
290
+
291
+ return climate_features
292
+
293
+
294
+ class GatedFusion(nn.Module):
295
+ def __init__(self, img_channels, clim_channels):
296
+ super().__init__()
297
+ self.gate = nn.Sequential(
298
+ nn.Sequential(
299
+ nn.Conv2d(
300
+ img_channels + clim_channels, img_channels, kernel_size=3, padding=1
301
+ ),
302
+ nn.ReLU(inplace=True),
303
+ nn.Conv2d(img_channels, img_channels, kernel_size=1),
304
+ nn.Sigmoid(), # Gate values between 0 and 1
305
+ )
306
+ )
307
+
308
+ def forward(self, img_feat, clim_feat):
309
+ gate = self.gate(torch.cat([img_feat, clim_feat], dim=1))
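+         # Convex combination: where gate -> 1 the image features dominate,
+         # where gate -> 0 the climate features take over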
+         return gate * img_feat + (1 - gate) * clim_feat
+
+
+ def _build_upsampler(
+     in_channels: int, target_channels: int, target_h: int
+ ) -> nn.Sequential:
+     layers = []
+     current_h = 1
+
+     # Expand to target channels early (e.g., 1x1 -> 1x1 with target_channels)
+     layers += [nn.Conv2d(in_channels, target_channels, kernel_size=1), nn.GELU()]
+
+     # Upsample spatially to target_h
+     while current_h < target_h:
+         next_h = min(current_h * 2, target_h)
+         layers += [
+             nn.Upsample(scale_factor=2, mode="nearest"),
+             nn.Conv2d(target_channels, target_channels, kernel_size=3, padding=1),
+             nn.GELU(),
+         ]
+         current_h = next_h
+
+     return nn.Sequential(*layers)
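+
+ # For example, _build_upsampler(128, 96, 64) maps a (B, 128, 1, 1) climate
+ # embedding to a (B, 96, 64, 64) feature map via six nearest-neighbor
+ # doublings, each followed by a 3x3 convolution and GELU.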