A real-time, AI-based intrusion detection system (IDS) designed specifically for Industrial Control Systems (ICS). This project implements a scientifically grounded methodology based on the ICS-Flow research framework: it follows the peer-reviewed approach and extends it toward a real-time, deployable IDS prototype with online inference and monitoring capabilities.
Rather than designing an IDS in an ad-hoc manner, this project:
- Follows scientific methodology from peer-reviewed research
- Uses the official ICS-Flow dataset (Kaggle-published benchmark)
- Employs paper-specified tools (ICSSIM testbed, ICSFlowGenerator)
- Extends offline experiments to real-time deployment
- Acknowledges limitations (dataset/domain shift, environment dependency)
This ensures methodological rigor, reproducibility, and meaningful comparison with published results.
- Scientific Foundation
- Architecture
- Installation
- Quick Start
- Project Structure
- Real-Time Deployment
- Dashboard
- Performance
- Research Perspective
- Configuration
- Testing with ICSSIM
- To-Do
- References
- Author
This project is primarily based on the following scientific work:
Alireza Dehlaghi-Ghadim, Mahshid Helali Moghadam, Ali Balador, Hans Hansson
"Anomaly Detection Dataset for Industrial Control Systems"
arXiv:2305.09678, 2023
Read the paper: https://arxiv.org/abs/2305.09678
The paper introduces the ICS-Flow dataset, a realistic benchmark for evaluating machine-learning-based intrusion detection systems in ICS environments, and proposes a complete methodology covering:
- ICS testbed design
- Attack implementation
- Flow-based feature extraction
- Labeling strategies
- Supervised and unsupervised ML evaluation
The IDS is trained and evaluated using the publicly available ICS-Flow dataset, released by the paper's authors.
Dataset Properties:
- Generated from a realistic ICS simulation (bottle-filling factory)
- Built using the ICSSIM framework
- Contains:
- Raw PCAP files
- Flow-level network features
- Process state variables
- Attack logs
Attack Scenarios Included:
- Reconnaissance: IP scan, Port scan
- Replay attacks
- Distributed Denial of Service (DDoS)
- Man-in-the-Middle (MITM): False data injection
Dataset Access:
Kaggle: ICS-Flow Dataset
Note: This project does not modify the dataset semantics and uses it as intended by the authors.
The reference paper applies mRMR (Minimum Redundancy Maximum Relevance) feature selection to reduce dimensionality and retain the most informative network flow features.
Following the paper exactly:
- 23 flow-level features are selected
- Only features with mRMR score ≥ 0.07 are retained
- No additional features introduced
This ensures methodological consistency and enables meaningful comparison with the paper's results.
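To make the selection criterion concrete, here is a minimal sketch of greedy mRMR on discrete features, using the difference form (relevance minus mean redundancy, both measured as mutual information). It is an illustration only: the function names and the toy data are hypothetical, and the paper's exact scoring variant and its 0.07 cut-off are not reproduced here.

```python
from collections import Counter
from math import log


def mutual_information(xs, ys):
    """Mutual information (in nats) between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * log(c * n / (px[x] * py[y])) for (x, y), c in pxy.items())


def mrmr_rank(features, target, k):
    """Greedily pick k feature names maximizing relevance minus mean redundancy."""
    relevance = {name: mutual_information(col, target) for name, col in features.items()}
    selected = []
    remaining = sorted(features)  # sorted for deterministic tie-breaking

    def score(name):
        if not selected:
            return relevance[name]
        redundancy = sum(
            mutual_information(features[name], features[s]) for s in selected
        ) / len(selected)
        return relevance[name] - redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (hypothetical): "b" duplicates "a", "c" is weakly relevant but independent of "a".
features = {
    "a": [0, 0, 0, 0, 1, 1, 1, 1],
    "b": [0, 0, 0, 0, 1, 1, 1, 1],
    "c": [0, 1, 0, 1, 0, 1, 0, 1],
}
target = [0, 1, 0, 0, 1, 1, 1, 1]
top_two = mrmr_rank(features, target, k=2)
```

In this toy case the duplicate feature `b` is skipped in favor of `c`, even though `b` is individually more relevant, which is the redundancy penalty at work.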
The paper proposes two labeling strategies:
- Injection Timing (IT)
- Network Security Tools (NST)
For intrusion detection, the task is formulated as binary classification:
- 0 → Normal flow
- 1 → Attack flow
This project:
- Uses the NST binary label (NST_B_Label)
┌─────────────────────────────────────────────────────────────┐
│ ICSSIM Dockerized ICS Testbed │
│ (PLCs, HMIs, Sensors, Actuators, Modbus/TCP) │
└──────────────────────┬──────────────────────────────────────┘
│ Network Traffic
▼
┌─────────────────────────────────────────────────────────────┐
│ ICSFlowGenerator (Flow Extraction Tool) │
│ • Sniffs packets from docker bridge (br_icsnet) │
│ • Converts raw packets → flow-level features │
│ • Outputs: CSV with 23 mRMR-selected features │
└──────────────────────┬──────────────────────────────────────┘
│ sniffed.csv
▼
┌─────────────────────────────────────────────────────────────┐
│ Real-Time Sensor (realtime_sensor.py) │
│ • Polls CSV every 2 seconds │
│ • Preprocesses: normalize + encode protocol │
│ • Runs ML inference (Random Forest / Isolation Forest) │
│ • Stores high-confidence attacks (≥95%) in SQLite │
└──────────────────────┬──────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ SQLite Database (ids_events.db) │
│ Table: alerts │
│ Schema: timestamp, src_ip, dst_ip, protocol, confidence │
└──────────────────────┬──────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ Streamlit Dashboard (app.py) │
│ • Traffic overview metrics │
│ • Real-time alerts table │
│ • Auto-refresh (5 seconds) │
└─────────────────────────────────────────────────────────────┘
Key Design Principle: The IDS uses the same flow extraction tool (ICSFlowGenerator) for both:
- Offline training (on ICS-Flow dataset)
- Online deployment (on live ICSSIM traffic)
This ensures feature compatibility between training and deployment environments.
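The alert-storage step of the sensor stage can be sketched as follows. This is a minimal illustration, not the project's actual code: the column types, the `store_alerts` helper, and the flow dictionary keys are assumptions derived from the schema shown in the diagram (`timestamp, src_ip, dst_ip, protocol, confidence`).

```python
import sqlite3
from datetime import datetime, timezone

# Assumed column types; the diagram only names the columns.
SCHEMA = """
CREATE TABLE IF NOT EXISTS alerts (
    timestamp  TEXT,
    src_ip     TEXT,
    dst_ip     TEXT,
    protocol   TEXT,
    confidence REAL
)
"""


def store_alerts(conn, flows, confidences, threshold=0.95):
    """Insert flows whose attack confidence meets the threshold; return how many were stored."""
    conn.execute(SCHEMA)
    now = datetime.now(timezone.utc).isoformat()
    rows = [
        (now, flow["src_ip"], flow["dst_ip"], flow["protocol"], conf)
        for flow, conf in zip(flows, confidences)
        if conf >= threshold
    ]
    conn.executemany("INSERT INTO alerts VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


# Toy usage with an in-memory database and made-up flows.
conn = sqlite3.connect(":memory:")
flows = [
    {"src_ip": "192.168.0.42", "dst_ip": "192.168.0.11", "protocol": "TCP"},
    {"src_ip": "192.168.0.21", "dst_ip": "192.168.0.11", "protocol": "TCP"},
]
stored = store_alerts(conn, flows, confidences=[0.99, 0.40])
```

Only the first flow is stored here, because the second falls below the 0.95 threshold.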
git clone https://github.com/morchidy/AI-Based-IDS-for-ICS.git
cd AI-Based-IDS-for-ICS
pip install -r requirements.txt
Place your ICS-Flow dataset in data/raw/Dataset.csv
python src/preprocessing/preprocess_dataset.py
This will:
- Extract 23 mRMR-selected features
- Split data (50% train, 20% val, 30% test)
- Normalize features using MinMaxScaler
- Save artifacts to models/artifacts/
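The split and normalization steps above can be sketched in plain Python to show the arithmetic. This is an illustration under assumptions: the project uses scikit-learn's MinMaxScaler, whereas the helpers below (`split_indices`, `fit_minmax`, `apply_minmax`) are hypothetical stand-ins, and whether the real script shuffles or stratifies the split is not stated in this README.

```python
import random


def split_indices(n, train=0.5, val=0.2, seed=42):
    """Shuffle row indices and cut them into train/val/test (the remainder) parts."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train, n_val = int(n * train), int(n * val)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]


def fit_minmax(rows):
    """Learn per-column minimum and maximum from the training rows only."""
    cols = list(zip(*rows))
    return [min(c) for c in cols], [max(c) for c in cols]


def apply_minmax(rows, mins, maxs):
    """Scale each column with the training minimum and maximum (constant columns map to 0)."""
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0 for v, lo, hi in zip(row, mins, maxs)]
        for row in rows
    ]


rows = [[float(i), float(i * 10)] for i in range(10)]  # toy data, not ICS-Flow
train_idx, val_idx, test_idx = split_indices(len(rows))
mins, maxs = fit_minmax([rows[i] for i in train_idx])
X_train = apply_minmax([rows[i] for i in train_idx], mins, maxs)
X_test = apply_minmax([rows[i] for i in test_idx], mins, maxs)  # may fall outside [0, 1]
```

Fitting the scaler on the training split only, then reusing the saved minimum and maximum at inference time, keeps live flows on the same scale the model was trained on.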
Random Forest (Supervised):
python src/training/train_rf.py
Isolation Forest (Unsupervised):
python src/training/train_if.py
python src/inference/realtime_sensor.py
The sensor will:
- Monitor network flow CSV file
- Detect attacks in real-time
- Store alerts in SQLite database
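The monitoring step can be sketched as a loop that remembers how many flow records it has already processed and only hands new ones to the model. This is a simplified illustration, not the contents of `realtime_sensor.py`: the helper names are hypothetical, and re-reading the whole file on every poll is chosen for clarity rather than efficiency.

```python
import csv
import time


def read_new_rows(csv_path, rows_seen):
    """Return (new_rows, total_rows) for data rows appended since the last call."""
    try:
        with open(csv_path, newline="") as fh:
            rows = list(csv.DictReader(fh))
    except FileNotFoundError:
        return [], rows_seen  # the flow generator has not created the file yet
    return rows[rows_seen:], len(rows)


def poll(csv_path, handle_rows, interval=2.0, max_cycles=None):
    """Call handle_rows(new_rows) whenever new flow records appear in the CSV."""
    seen, cycles = 0, 0
    while max_cycles is None or cycles < max_cycles:
        new_rows, seen = read_new_rows(csv_path, seen)
        if new_rows:
            handle_rows(new_rows)  # e.g. preprocess, run inference, store alerts
        cycles += 1
        if max_cycles is None or cycles < max_cycles:
            time.sleep(interval)
```

A production sensor would also need to cope with a row that is only partially written at the moment of reading; this sketch does not handle that case.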
streamlit run dashboard/app.py
Access the dashboard at: http://localhost:8501
AI-Based-IDS-for-ICS/
│
├── data/
│ ├── raw/ # Original datasets
│ │ └── Dataset.csv
│ ├── processed/ # Train/val/test splits
│ │ ├── X_train.csv
│ │ ├── X_val.csv
│ │ ├── X_test.csv
│ │ └── y_*.csv
│ └── ids_events.db # SQLite database (alerts)
│
├── models/
│ ├── artifacts/ # Preprocessing objects
│ │ ├── minmax_scaler.pkl
│ │ ├── protocol_encoder.pkl
│ │ └── selected_features.json
│ ├── supervised/
│ │ └── rf_model.pkl # Random Forest model
│ └── unsupervised/
│ └── if_model.pkl # Isolation Forest model
│
├── src/
│ ├── preprocessing/
│ │ └── preprocess_dataset.py # Feature engineering
│ ├── training/
│ │ ├── train_rf.py # Random Forest training
│ │ ├── train_if.py # Isolation Forest training
│ │ ├── train_ann.py # ANN training
│ │ └── train_autoencoder.py # Autoencoder training
│ └── inference/
│ ├── realtime_sensor.py # Live monitoring
│ └── model_adapters.py # Unified interface for different ML model types
│
├── dashboard/
│ └── app.py # Streamlit dashboard
│
├── README.md
├── requirements.txt
└── .gitignore
While the reference paper focuses on offline evaluation, this project extends the methodology to real-time intrusion detection by integrating with the same testbed used to generate the dataset.
What is ICSSIM?
ICSSIM is a framework for building ICS security testbeds, introduced by the same authors in:
"ICSSIM — A Framework for Building Industrial Control Systems Security Testbeds"
Architecture:
- Dockerized ICS simulation
- Simulates bottle-filling factory
- PLCs, HMIs, sensors, actuators
- Modbus/TCP communication protocol
- Reproducible attack execution
Why ICSSIM?
Testing the IDS on the same platform that generated the training dataset ensures:
- Feature compatibility
- Realistic attack scenarios
- Reproducible evaluation
- Minimal dataset/domain shift
Offline (Training):
# Convert PCAPs → Flow CSV for model training
ICSFlowGenerator --input dataset.pcap --output flows.csv
Online (Deployment):
# Sniff live traffic from docker bridge and generate flows
ICSFlowGenerator --interface br_icsnet --output sniffed.csv --live
Key Benefit:
Using the same flow extraction tool for both training and deployment means that:
- Feature semantics match exactly
- Preprocessing is consistent
- Models receive inputs in the same feature space they were trained on
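One way to verify this compatibility before starting the sensor is to compare the header of the live CSV with the feature list saved during preprocessing. The sketch below assumes that `selected_features.json` holds a plain JSON list of column names, which this README does not confirm; the helper and the feature names in the example are placeholders.

```python
import csv
import io
import json


def missing_features(selected_features_json, csv_header_line):
    """Return the expected feature names that are absent from the CSV header."""
    expected = json.loads(selected_features_json)
    header = next(csv.reader(io.StringIO(csv_header_line)))
    present = {name.strip() for name in header}
    return [name for name in expected if name not in present]


# Placeholder names for illustration; real names come from the ICS-Flow dataset.
expected_json = '["feat_a", "feat_b", "protocol"]'
header_line = "feat_a,protocol,extra_column"
missing = missing_features(expected_json, header_line)
```

An empty result means every feature the model expects is available in the live flow file; extra columns in the CSV are ignored.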
The Streamlit dashboard provides real-time visualization of IDS activity:
- Traffic Overview
- Recent Alerts Table
- Auto-Refresh
Test Set Results:
| Metric | Value |
|---|---|
| Accuracy | 99.53% |
| Precision | 98.28% |
| Recall | 99.37% |
| F1-Score | 98.82% |
Confusion Matrix:
| | Predicted Normal | Predicted Attack |
|---|---|---|
| Actual Normal | TN: 10,965 | FP: 47 |
| Actual Attack | FN: 17 | TP: 2,687 |
Test Set Results:
| Metric | Value |
|---|---|
| Accuracy | 88.33% |
| Precision | 90.23% |
| Recall | 45.78% |
| F1-Score | 60.75% |
Confusion Matrix:
| | Predicted Normal | Predicted Attack |
|---|---|---|
| Actual Normal | TN: 10,878 | FP: 134 |
| Actual Attack | FN: 1,466 | TP: 1,238 |
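The reported metrics follow directly from the confusion-matrix counts. The short sketch below recomputes them for both tables as a consistency check; the helper name `binary_metrics` is illustrative and not part of the project.

```python
def binary_metrics(tn, fp, fn, tp):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts."""
    total = tn + fp + fn + tp
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
    }


first_table = binary_metrics(tn=10965, fp=47, fn=17, tp=2687)
second_table = binary_metrics(tn=10878, fp=134, fn=1466, tp=1238)

for name, metrics in [("first table", first_table), ("second table", second_table)]:
    print(name, {k: f"{v:.2%}" for k, v in metrics.items()})
```

Both tables are evaluated on the same 13,716 test flows (11,012 normal and 2,704 attack). The gap between them is almost entirely a recall gap: 17 missed attacks in the first table against 1,466 in the second.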
Consistent with the paper's discussion, this project explicitly acknowledges fundamental challenges in ICS intrusion detection:
1. Environment Dependency
- IDS performance varies across different ICS environments
- Models trained on one testbed may not generalize perfectly
- Network topology, protocols, and traffic patterns affect detection
2. Dataset/Domain Shift
- Training data (ICSSIM) vs. real-world ICS networks
- Attack diversity in training vs. deployment
- Temporal concept drift (evolving attack techniques)
Adjust in src/inference/realtime_sensor.py:
monitor_realtime(
    confidence_threshold=0.95  # 95% confidence required for alert
)
Edit dashboard/app.py:
time.sleep(5)  # Refresh every 5 seconds
st.rerun()
This section describes how to test the IDS in a realistic ICS environment using the ICSSIM testbed and ICSFlowGenerator for live traffic capture.
| Tool | Description | Link |
|---|---|---|
| ICSSIM | Dockerized ICS testbed (Modbus/TCP) | GitHub |
| ICSFlowGenerator | Converts raw packets to CSV flow records | Provided with the paper |
| Docker & Docker Compose | Required to run ICSSIM | docs.docker.com |
The original ICSFlowGenerator code may require two fixes before it works correctly:
Fix 1: ModuleNotFoundError
In three files, remove the src. prefix from import paths:
# Files to fix:
# cicflowmeter/flow_session.py
# cicflowmeter/features/context/packet_direction.py
# cicflowmeter/features/context/packet_flow_key.py
# Before:
from src.cicflowmeter.flow import Flow
# After:
from cicflowmeter.flow import Flow
Apply the same pattern (src.cicflowmeter... → cicflowmeter...) to all broken imports in those three files.
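If you prefer not to edit the files by hand, the same rewrite can be scripted. The sketch below is a convenience helper, not part of ICSFlowGenerator: `strip_src_prefix` and `fix_files` are hypothetical names, and `root` must be whichever directory contains the `cicflowmeter` package in your checkout.

```python
import re
from pathlib import Path

# Matches "from src.cicflowmeter..." and "import src.cicflowmeter..." at line start.
_SRC_IMPORT = re.compile(r"^([ \t]*(?:from|import)[ \t]+)src\.(cicflowmeter\b)", re.MULTILINE)

FILES = [
    "cicflowmeter/flow_session.py",
    "cicflowmeter/features/context/packet_direction.py",
    "cicflowmeter/features/context/packet_flow_key.py",
]


def strip_src_prefix(source):
    """Remove the leading 'src.' from cicflowmeter import statements."""
    return _SRC_IMPORT.sub(r"\1\2", source)


def fix_files(root):
    """Rewrite the three affected files in place, relative to the given root."""
    for rel in FILES:
        path = Path(root) / rel
        path.write_text(strip_src_prefix(path.read_text()))


before = "from src.cicflowmeter.flow import Flow\nimport os\n"
after = strip_src_prefix(before)
```

Only import lines that start with `src.cicflowmeter` are touched; other imports and any occurrence of the string inside code or comments are left unchanged.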
Fix 2: Dependency Version Constraint
In ICSFlowGenerator's requirements.txt, relax the compatible-release pin (~=, which only allows 2.5.x releases) to a minimum-version constraint (>=):
# Before:
scapy~=2.5.0
# After:
scapy>=2.5.0
This avoids version conflicts on newer systems.
Testing requires 5 terminals running simultaneously. Open them all before starting.
cd /path/to/ICSSIM/deployments
./init.sh
Wait until all containers are running and the Modbus communication is active. You should see log output from the PLC, HMI, and network components.
cd /path/to/AI-Based-IDS-for-ICS
python src/inference/realtime_sensor.py \
-i /path/to/ICSFlowGenerator/output/sniffed.csv \
-m models/supervised/rf_model.pkl \
    -t 0.95
The sensor will poll the CSV file for new flow records and run inference. It prints alerts to the console and saves them to the SQLite database.
cd /path/to/AI-Based-IDS-for-ICS
streamlit run dashboard/app.py
Open http://localhost:8501 in your browser to monitor alerts in real time.
cd /path/to/ICSFlowGenerator
sudo python3 src/ICSFlowGenerator.py sniff --source br_icsnet \
    --interval 0.5 --target_file output/sniffed --use_port True
Note: Replace br_icsnet with the correct network interface. Use ip a or ifconfig to find the interface connected to the ICSSIM Docker network.
ICSFlowGenerator captures packets and converts them into CSV flow records that the IDS sensor reads.
Enter the attacker container and run an attack script:
# Enter the attacker container
cd /path/to/ICSSIM/
docker exec -it attacker bash
# Inside the container, run an attack
cd attacks
# In this directory, there are .sh files to run attacks (IP scan, port scan, man-in-the-middle, replay attack, DoS)
Once the attack is running, you should observe the following chain of events:
- ICSFlowGenerator captures the attack packets and writes new rows to sniffed.csv
- Realtime Sensor detects new flows, runs the ML model, and flags suspicious traffic
- SQLite Database stores the alerts with timestamp, IPs, protocol, and confidence
- Dashboard displays the alerts in real time with metrics and table view
ICSFlowGenerator → sniffed.csv → Realtime Sensor → ids_events.db → Dashboard
(packets) (flows) (inference) (alerts) (UI)
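To confirm that the chain reached the database without opening the dashboard, the alerts table can be queried directly. This is a minimal sketch that assumes the `alerts` schema shown in the architecture diagram and sortable timestamps; `recent_alerts` is a hypothetical helper, and the demo rows are made up.

```python
import sqlite3


def recent_alerts(conn, limit=10):
    """Return the most recent alerts, newest first."""
    cur = conn.execute(
        "SELECT timestamp, src_ip, dst_ip, protocol, confidence "
        "FROM alerts ORDER BY timestamp DESC LIMIT ?",
        (limit,),
    )
    return cur.fetchall()


# Self-contained demo on an in-memory database with made-up rows.
demo = sqlite3.connect(":memory:")
demo.execute(
    "CREATE TABLE alerts (timestamp TEXT, src_ip TEXT, dst_ip TEXT, protocol TEXT, confidence REAL)"
)
demo.executemany(
    "INSERT INTO alerts VALUES (?, ?, ?, ?, ?)",
    [
        ("2024-01-01T10:00:00", "192.168.0.42", "192.168.0.11", "TCP", 0.97),
        ("2024-01-01T10:00:05", "192.168.0.42", "192.168.0.12", "TCP", 0.99),
    ],
)
latest = recent_alerts(demo, limit=5)
```

Against the real database, replace the in-memory connection with `sqlite3.connect("data/ids_events.db")`, the path shown in the project structure.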
- Create model comparison script
- Create Docker container
- Add CI/CD pipeline
Alireza Dehlaghi-Ghadim, Mahshid Helali Moghadam, Ali Balador, Hans Hansson
"Anomaly Detection Dataset for Industrial Control Systems"
arXiv:2305.09678, 2023
https://arxiv.org/abs/2305.09678
ICSSIM Framework:
"ICSSIM — A Framework for Building Industrial Control Systems Security Testbeds"
By the same authors
ICS-Flow Dataset:
Kaggle
- ICSFlowGenerator: Official flow extraction tool (paper-provided)
- ICSSIM: Dockerized ICS testbed framework
Morchid Youssef
- GitHub: @morchidy
- Project: AI-Based-IDS-for-ICS
- Email: morchidy33@gmail.com




