
[R] shareable module architecture #198

@michalharakal

Description


MNIST-CNN Integration Module — PRD (mnist-int)

Summary

Create a standalone, reusable Gradle/Kotlin Multiplatform module that makes it trivial for apps to load and run the pre-trained MNIST CNN model. The module abstracts model format IO via io-core, uses io-gguf for GGUF parameter loading (initial implementation), embeds a pre-trained GGUF binary as a resource, and exposes a coroutine/Flow-based API for lifecycle and inference. This enables future swaps of model storage formats without changing the consuming app code.

Problem Statement

  • Apps currently need to know about model loading, resource handling, and execution details for MNIST CNN.
  • Model format/loader dependencies leak into apps, increasing coupling.
  • No consistent lifecycle or shared-instance pattern for loading and executing the model across app features.

Goals

  • Provide an app-friendly module that:
    • Loads a pre-trained MNIST CNN model with progress reporting.
    • Executes inference on 1x28x28 grayscale images (or batched inputs) and returns logits/probabilities.
    • Offers a Flow-based shared model lifecycle: loading → ready → running → unloaded → error.
    • Hides model format and loader (GGUF initially), enabling future replacements via io-core abstraction.
    • Works in multiplatform settings with platform-appropriate resource loading.

Non-Goals

  • Training the model or providing training utilities in this module.
  • General-purpose CNN abstractions beyond what is already in skainet-lang-models.
  • Full-featured image pre/post-processing pipelines (provide minimal helpers only).

Users

  • App developers in skainet-apps (e.g., KGPChat demo or other apps) who need MNIST classification without dealing with model internals.

Current Capabilities and Architecture (as-is)

  • skainet-lang-models defines MNIST CNN network structures (sk/ainet/lang/model/dnn/cnn/MNIST.kt) and helpers (e.g., grayscale conversion).
  • skainet-io-gguf provides GGUF parameter loading; the project hints at a progress-capable loader.
  • io-core abstraction exists (in skainet-io) to unify parameter load sources independently of file format.
  • Prior RFC/notes in mnist-gguf.md outline a Shared/Flow-based lifecycle and resource packaging approach.

Gaps:

  • No dedicated integration module that:
    • Bundles the model file as a resource.
    • Wires the CNN definition to the GGUF loader via io-core.
    • Exposes a stable, simple API surface for apps.

Proposal: New Module skainet-int/skainet-int-mnist-cnn

A KMP module delivering a single responsibility: load and run the pre-trained MNIST CNN with minimal setup, using Flow to expose lifecycle, and abstracting IO.

High-level Architecture

flowchart LR
  A[App/UI Layer] -->|observe state, call run| B[MnistCnnShared]
  B -->|build network| C[MnistCnnModule]
  C -->|load params via| D[io-core Loader API]
  D -->|GGUF impl| E[io-gguf]
  C -->|ops| F[ExecutionContext]
  C -->|model def| G[skainet-lang-models]
  C -->|resource bytes| H[(embedded gguf)]
  • MnistCnnShared provides a coroutine-safe shared instance for model loading and inference, exposing StateFlow<MnistCnnSharedState>.
  • MnistCnnModule encapsulates the assembled network and forward pass.
  • io-core defines the loader interface (format-agnostic). io-gguf is the initial concrete implementation.
  • The GGUF file is embedded as a resource; platform-specific resource access is abstracted.

Module Boundaries

  • Public API: MnistCnnShared, MnistCnnSharedState, minimal helpers for input shaping.
  • Internal: MnistCnnModule assembly, resource accessors, loader wiring.
  • Dependencies:
    • skainet-lang-models for MNIST CNN topology and tensor types.
    • skainet-io:io-core for loader abstractions.
    • skainet-io:io-gguf for GGUF implementation (runtime dep; swappable later).
    • Kotlin coroutines/Flow.

Detailed Design

Public API (commonMain)

package sk.ainet.int.mnist

import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.flow.StateFlow
import sk.ainet.lang.tensor.*
import sk.ainet.lang.dtype.FP32
import sk.ainet.lang.exec.ExecutionContext

sealed interface MnistCnnSharedState {
    data object Unloaded : MnistCnnSharedState
    data class Loading(val current: Long, val total: Long, val message: String?) : MnistCnnSharedState
    data class Ready(val module: MnistCnnModule) : MnistCnnSharedState
    data class Running(val current: Long, val total: Long, val message: String?) : MnistCnnSharedState
    data class Error(val throwable: Throwable) : MnistCnnSharedState
}

class MnistCnnShared(
    private val scope: CoroutineScope,
    private val execContext: ExecutionContext,
    private val loader: ModelParameterLoader = DefaultModelParameterLoader // from io-core
) {
    val state: StateFlow<MnistCnnSharedState>

    suspend fun load(resourcePath: String = DEFAULT_GGUF_PATH)
    suspend fun run(input: Tensor<FP32, Float>): Tensor<FP32, Float>
    suspend fun unload()

    companion object {
        const val DEFAULT_GGUF_PATH: String = "/models/mnist/mnist-cnn-f32.gguf"
    }
}
  • The loader is an io-core interface; default implementation delegates to io-gguf.
  • ExecutionContext is passed in to align with the app/backend selection.
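Consuming the sealed state hierarchy is where the Flow-based API pays off: the compiler forces callers to handle every lifecycle phase. A hedged sketch, using a local stand-in `State` hierarchy and a hypothetical `render` helper so the snippet is self-contained (the real `MnistCnnSharedState.Ready` additionally carries a `MnistCnnModule`):

```kotlin
// Local stand-in for MnistCnnSharedState; names mirror the proposed API.
sealed interface State {
    object Unloaded : State
    data class Loading(val current: Long, val total: Long) : State
    object Ready : State
    data class Error(val throwable: Throwable) : State
}

// An exhaustive 'when' means adding a new lifecycle state later becomes a
// compile error at every call site instead of a silent gap in the UI.
fun render(state: State): String = when (state) {
    State.Unloaded -> "Tap to load model"
    is State.Loading -> "Loading ${state.current}/${state.total}"
    State.Ready -> "Ready - draw a digit"
    is State.Error -> "Failed: ${state.throwable.message}"
}
```

In an app this `when` would typically live inside a `state.collect { ... }` block on the `StateFlow` exposed by `MnistCnnShared`.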

Internal Assembly

// internal
class MnistCnnModule(
    private val net: MnistCnn, // network from skainet-lang-models
    private val execContext: ExecutionContext
) {
    suspend fun forward(x: Tensor<FP32, Float>): Tensor<FP32, Float> = net.forward(x, execContext)
}
// internal resource access (common expect API)
expect fun readResourceBytes(path: String): ByteArray

Platform-specific actual implementations are provided in jvmMain, androidMain, iosMain, etc.
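For illustration, the JVM side of the `expect` declaration could resolve the path against the classpath root, which matches the leading-slash convention of `DEFAULT_GGUF_PATH`. A hedged sketch (the `actual` modifier is omitted so the snippet compiles standalone; the real implementation lives in `jvmMain`):

```kotlin
// Hypothetical jvmMain implementation: resolve the resource via the
// classloader and read it fully into memory. Paths starting with '/' are
// resolved from the classpath root.
fun readResourceBytes(path: String): ByteArray {
    val stream = object {}.javaClass.getResourceAsStream(path)
        ?: error("Resource not found on classpath: $path")
    return stream.use { it.readBytes() }
}
```

Android and iOS actuals would instead go through assets/`NSBundle` respectively; the common code never sees the difference.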

Loader Abstraction

// io-core (existing in skainet-io)
interface ModelParameterLoader {
    suspend fun load(
        target: ParameterTarget,
        bytes: ByteArray,
        onProgress: (current: Long, total: Long, message: String?) -> Unit = { _, _, _ -> }
    )
}

The module calls loader.load(target=net, bytes=resourceBytes, onProgress=...). io-gguf implements this for GGUF.
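The progress contract the module depends on can be illustrated with a hypothetical in-memory loader: `current` grows monotonically up to `total` (the byte count), and the final callback reports `current == total`. This is a sketch only; real GGUF tensor parsing is replaced by a no-op over fixed-size chunks, and the function is shown non-suspending for brevity (the io-core interface is `suspend`):

```kotlin
// Hypothetical loader used to demonstrate the onProgress contract.
class FakeChunkedLoader(private val chunkSize: Int = 4096) {
    fun load(
        bytes: ByteArray,
        onProgress: (current: Long, total: Long, message: String?) -> Unit
    ) {
        val total = bytes.size.toLong()
        var current = 0L
        while (current < total) {
            val step = minOf(chunkSize.toLong(), total - current)
            // a real implementation would parse GGUF tensor data here
            current += step
            onProgress(current, total, null)
        }
    }
}
```

A test can assert that the emitted `current` values increase monotonically and end exactly at `total`, which is also what the integration test in the Testing Strategy section checks against the real io-gguf loader.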

State and Concurrency

  • Use a Mutex to guard load/run/unload to prevent concurrent mutation.
  • Emit progress via StateFlow<MnistCnnSharedState>.
  • Always restore Ready after run, or set Error if exceptions occur.

Resource Packaging

  • Place model file at src/commonMain/resources/models/mnist/mnist-cnn-f32.gguf.
  • File is included in published artifacts on JVM/Android and bundled appropriately for other targets.
  • Provide tools script or task to validate presence and checksum of resource.
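The checksum validation could be as simple as hashing the embedded bytes and comparing against a value pinned in the repo. A minimal sketch of the hashing step (JVM-only, via `java.security.MessageDigest`; the pinned expected value and the Gradle task wiring are left out):

```kotlin
import java.security.MessageDigest

// Compute a SHA-256 hex digest of the embedded GGUF bytes so a build task or
// test can compare it against a checked-in reference value.
fun sha256Hex(bytes: ByteArray): String =
    MessageDigest.getInstance("SHA-256")
        .digest(bytes)
        .joinToString("") { "%02x".format(it) }
```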

Example Usage in an App

class MyViewModel(
    private val execContext: ExecutionContext,
    private val scope: CoroutineScope
) {
    private val shared = MnistCnnShared(scope, execContext)
    val state: StateFlow<MnistCnnSharedState> = shared.state

    suspend fun init() = shared.load()
    suspend fun classify(x: Tensor<FP32, Float>) = shared.run(x)
}

Module Layout and Gradle

  • Module path: skainet-int/skainet-int-mnist-cnn
  • Source sets: commonMain, commonTest, jvmMain, androidMain, iosMain, jsMain (as needed)

Gradle (KMP) sketch:

plugins {
    kotlin("multiplatform")
    `maven-publish`
}

kotlin {
    jvm()
    androidTarget() // if Android
    ios()
    js(IR) // optional

    sourceSets {
        val commonMain by getting {
            dependencies {
                api(projects.skainetLang.skainetLangModels)
                implementation(projects.skainetIo.ioCore)
                implementation(projects.skainetIo.ioGguf) // runtime impl
                implementation(libs.coroutines.core)
            }
        }
        val commonTest by getting {
            dependencies { implementation(kotlin("test")) }
        }
        val jvmMain by getting {}
        val androidMain by getting {}
        // iosMain/jsMain as needed
    }
}

Update settings.gradle.kts:

include(":skainet-int:skainet-int-mnist-cnn")

Sequence: Load → Run → Unload

sequenceDiagram
  participant App
  participant Shared as MnistCnnShared
  participant Loader as ModelParameterLoader (io-gguf)
  participant Net as MnistCnnModule

  App->>Shared: load()
  Shared->>Shared: state=Loading(...)
  Shared->>Shared: readResourceBytes(gguf)
  Shared->>Loader: load(target=net, bytes, onProgress)
  Loader-->>Shared: progress callbacks
  Loader-->>Shared: OK
  Shared->>Shared: state=Ready(Net)

  App->>Shared: run(input)
  Shared->>Shared: state=Running
  Shared->>Net: forward(input)
  Net-->>Shared: output
  Shared->>Shared: state=Ready
  Shared-->>App: output

  App->>Shared: unload()
  Shared->>Shared: module=null; state=Unloaded

Data Contracts

  • Input: Tensor<FP32, Float> shaped [1, 1, 28, 28] or batched [N, 1, 28, 28].
  • Output: Tensor<FP32, Float> shaped [N, 10] logits. Helper can apply softmax for probabilities.

Optional helpers:

fun asMnistInput(gray: FloatArray): Tensor<FP32, Float> { /* normalize, shape to [1,1,28,28] */ }
fun argmax(logits: Tensor<FP32, Float>): Int { /* returns predicted digit 0..9 */ }
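The helper logic can be sketched over plain `FloatArray` (the real helpers wrap this in `Tensor<FP32, Float>`). Assumptions are hedged in the comments: the exact normalization the module ships may differ, e.g. mean/std standardization instead of simple 0..1 scaling:

```kotlin
import kotlin.math.exp

// Assumes 0..255 grayscale input scaled to 0..1; 784 = 28*28 pixels.
fun normalizeMnist(gray: FloatArray): FloatArray {
    require(gray.size == 28 * 28) { "expected 784 pixels, got ${gray.size}" }
    return FloatArray(gray.size) { i -> gray[i] / 255f }
}

// Numerically stable softmax: subtracting the max before exponentiating
// avoids overflow without changing the result.
fun softmax(logits: FloatArray): FloatArray {
    require(logits.isNotEmpty())
    var max = logits[0]
    for (v in logits) if (v > max) max = v
    val exps = FloatArray(logits.size) { i -> exp(logits[i] - max) }
    val sum = exps.sum()
    return FloatArray(exps.size) { i -> exps[i] / sum }
}

// Index of the largest logit = predicted digit 0..9 for a [10]-sized row.
fun argmax(logits: FloatArray): Int {
    require(logits.isNotEmpty())
    var best = 0
    for (i in 1 until logits.size) if (logits[i] > logits[best]) best = i
    return best
}
```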

Error Handling & Telemetry

  • Map loader or IO exceptions to MnistCnnSharedState.Error.
  • Provide simple metrics hooks (time to load, time to run, bytes loaded) via optional callback or logger.

Performance Considerations

  • Avoid re-parsing GGUF on every run; keep MnistCnnModule in memory until unload().
  • Allow apps to supply ExecutionContext selecting CPU/GPU backends.
  • Consider lazy variant for cold start vs. fully-eager load.

Security & Licensing

  • Verify license for distributing pre-trained MNIST CNN weights in GGUF; include NOTICE if required.
  • Ensure no PII is embedded; MNIST is a public benchmark dataset, so derived weights are generally distribution-friendly, but this should be verified.

Testing Strategy

  • Unit tests (common):
    • State transitions: Unloaded -> Loading -> Ready -> Running -> Ready -> Unloaded.
    • Error pathways when resource missing or corrupt.
  • Integration tests (JVM):
    • Load real GGUF resource and classify a known digit sample; assert top-1 correct.
    • Progress callback emits increasing current up to total.
  • Resource test: checksum validation of the embedded GGUF file.
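The state-transition tests above can run recorded `StateFlow` emissions against a small table of legal transitions. A hedged sketch, using an enum mirror of the proposed state names (which transitions out of `Error` are legal is an assumption to confirm against the final API):

```kotlin
// Enum mirror of MnistCnnSharedState for transition checking in tests.
enum class Phase { UNLOADED, LOADING, READY, RUNNING, ERROR }

// Legal successor states for each phase, per the proposed lifecycle.
val allowed: Map<Phase, Set<Phase>> = mapOf(
    Phase.UNLOADED to setOf(Phase.LOADING),
    Phase.LOADING to setOf(Phase.READY, Phase.ERROR),
    Phase.READY to setOf(Phase.RUNNING, Phase.UNLOADED),
    Phase.RUNNING to setOf(Phase.READY, Phase.ERROR),
    Phase.ERROR to setOf(Phase.UNLOADED, Phase.LOADING),
)

// True iff every consecutive pair in the recorded sequence is a legal step.
fun isLegalSequence(seq: List<Phase>): Boolean =
    seq.zipWithNext().all { (from, to) -> to in allowed.getValue(from) }
```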

Migration & Rollout

  • Add module in repo; enable publication if relevant.
  • Update one reference app (e.g., skainet-apps/...) to use MnistCnnShared instead of ad-hoc loading.
  • Document minimal usage in README of the new module.

Implementation Plan (Step-by-step)

  1. Create module skainet-int/skainet-int-mnist-cnn (KMP skeleton).
  2. Add dependencies on skainet-lang-models, skainet-io:io-core, skainet-io:io-gguf, and coroutines.
  3. Add MnistCnnSharedState, MnistCnnShared, and internal MnistCnnModule.
  4. Define expect fun readResourceBytes(path: String): ByteArray in common; provide actual implementations per platform.
  5. Place mnist-cnn-f32.gguf under src/commonMain/resources/models/mnist/ with checksum.
  6. Wire loader call: loader.load(target = net, bytes = readResourceBytes(DEFAULT_GGUF_PATH), onProgress = ...).
  7. Implement concurrency guards (Mutex) and state Flow.
  8. Add helper functions for input shaping and softmax/argmax.
  9. Write unit tests for state transitions and error cases.
  10. Write JVM integration test: load real model and classify sample.
  11. Update settings.gradle.kts and publish tasks as needed.
  12. Update an app to consume the new module and remove duplicated logic.

Open Questions

  • Exact io-core loader interfaces in this repo: confirm signature to align progress callback and target parameter mapping.
  • Which platforms must be supported in v1 (JVM-only vs. Android/iOS/JS)?
  • Preferred execution backend defaults (CPU vs. auto-detect GPU when present).
  • Model file size constraints for mobile store distribution.

Acceptance Criteria

  • Module builds and publishes.
  • App can load MNIST CNN with a single load() call, observe progress via StateFlow.
  • Inference works with expected accuracy on standard MNIST test samples.
  • GGUF is an internal detail; replacing GGUF with another loader requires no app code changes.
  • Documentation includes quickstart and API reference with example.
