AIComputing101
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
Lines changed: 3 additions & 3 deletions
diff --git a/README.md b/README.md
Lines changed: 30 additions & 13 deletions
diff --git a/docker/README.md b/docker/README.md
Lines changed: 12 additions & 12 deletions
diff --git a/docker/docker-compose.yml b/docker/docker-compose.yml
Lines changed: 2 additions & 2 deletions
diff --git a/docker/rocm/Dockerfile b/docker/rocm/Dockerfile
Lines changed: 6 additions & 77 deletions
diff --git a/docker/scripts/build.sh b/docker/scripts/build.sh
Lines changed: 1 addition & 1 deletion
diff --git a/docker/scripts/run.sh b/docker/scripts/run.sh
Lines changed: 1 addition & 3 deletions
@@ -41,7 +41,7 @@ docker-compose up -d cuda-dev  # For NVIDIA GPUs
 docker-compose up -d rocm-dev  # For AMD GPUs
 
 # Option 2: Native development
-# Install CUDA Toolkit 12.9.1+ or ROCm 6.4.3+
+# Install CUDA Toolkit 12.9.1+ or ROCm latest
 # See modules/module1/README.md for detailed setup instructions
 
 # Build all examples
@@ -241,8 +241,8 @@ When reporting bugs, please include:
 ### Environment Information
 - **Operating System**: (Ubuntu 22.04, Windows 11, etc.)
 - **GPU**: (RTX 4090, RX 7900 XTX, etc.)
-- **Driver Version**: (NVIDIA 535.x, ROCm 6.4.3, etc.)
-- **CUDA/HIP Version**: (12.9.1, 6.4.3, etc.)
+- **Driver Version**: (NVIDIA 535.x, ROCm latest, etc.)
+- **CUDA/HIP Version**: (12.9.1, 7.0, etc.)
 - **Docker**: (if using containerized development)
 
 ### Bug Description
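
The environment fields listed in this hunk can be collected with a short script. The following is a minimal sketch and not part of this change: `report_tool` is a hypothetical helper, and the sketch assumes that `nvidia-smi`, `nvcc`, `hipcc`, and `docker` print usable version text when they are installed.

```shell
#!/bin/sh
# Hypothetical helper (not in the repository): run a command and print its
# output indented under a label, or "<label>: not found" when the command
# is absent from PATH.
report_tool() {
    label=$1
    shift
    if command -v "$1" >/dev/null 2>&1; then
        printf '%s:\n' "$label"
        "$@" 2>&1 | sed 's/^/  /'
    else
        printf '%s: not found\n' "$label"
    fi
}

report_tool "Kernel" uname -sr
report_tool "NVIDIA driver" nvidia-smi --query-gpu=driver_version --format=csv,noheader
report_tool "CUDA compiler" nvcc --version
report_tool "HIP compiler" hipcc --version
report_tool "Docker" docker --version
```

Tools that are missing are reported as "not found" rather than aborting the script, so the same script can be pasted into a report from either an NVIDIA or an AMD machine.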
 
@@ -2,9 +2,9 @@
 
 [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
 [![CUDA](https://img.shields.io/badge/CUDA-12.9.1-76B900?logo=nvidia)](https://developer.nvidia.com/cuda-toolkit)
-[![ROCm](https://img.shields.io/badge/ROCm-6.4.3-red?logo=amd)](https://rocmdocs.amd.com/)
+[![ROCm](https://img.shields.io/badge/ROCm-7.0-red?logo=amd)](https://rocmdocs.amd.com/)
 [![Docker](https://img.shields.io/badge/Docker-Ready-2496ED?logo=docker)](https://www.docker.com/)
-[![Examples](https://img.shields.io/badge/Examples-70%2B-green)](modules/)
+[![Examples](https://img.shields.io/badge/Examples-71-green)](modules/)
 [![CI](https://img.shields.io/badge/CI-GitHub%20Actions-2088FF?logo=github-actions)](https://github.com/features/actions)
 
 **A comprehensive, hands-on educational project for mastering GPU programming with CUDA and HIP**
@@ -35,7 +35,7 @@
 **GPU Programming 101** is a complete educational resource for learning modern GPU programming. This project provides:
 
 - **9 comprehensive modules** covering beginner to expert topics
-- **70+ working code examples** in both CUDA and HIP
+- **71 working code examples** in both CUDA and HIP
 - **Cross-platform support** for NVIDIA and AMD GPUs  
 - **Production-ready development environment** with Docker
 - **Professional tooling** including profilers, debuggers, and CI/CD
@@ -197,10 +197,11 @@ This architectural knowledge is essential for writing efficient GPU code and is
 |---------|-------------|
 | 🎯 **Complete Curriculum** | 9 progressive modules from basics to advanced topics |
 | 💻 **Cross-Platform** | Full CUDA and HIP support for NVIDIA and AMD GPUs |
-| 🐳 **Docker Ready** | Complete containerized development environment |
-| 🔧 **Production Quality** | Professional build systems, testing, and profiling |
+| 🐳 **Docker Ready** | Complete containerized development environment with CUDA 12.9.1 & ROCm 7.0 |
+| 🔧 **Production Quality** | Professional build systems, auto-detection, testing, and profiling |
 | 📊 **Performance Focus** | Optimization techniques and benchmarking throughout |
 | 🌐 **Community Driven** | Open source with comprehensive contribution guidelines |
+| 🧪 **Advanced Libraries** | Support for Thrust, MIOpen, and production ML frameworks |
 
 ## 🚀 Quick Start
 
@@ -217,14 +218,14 @@ cd gpu-programming-101
 
 # Inside container: verify GPU access and start learning
 /workspace/test-gpu.sh
-cd modules/module1 && make && ./01_vector_addition_cuda
+cd modules/module1 && make && ./build/01_vector_addition_cuda
 ```
 
 ### Option 2: Native Installation
 For direct system installation:
 
 ```bash
-# Prerequisites: CUDA 11.0+ or ROCm 5.0+, GCC 7+, Make
+# Prerequisites: CUDA 12.0+ or ROCm 7.0+, GCC 9+, Make
 
 # Clone and build
 git clone https://github.com/AIComputing101/gpu-programming-101.git
@@ -265,7 +266,7 @@ Our comprehensive curriculum progresses from fundamental concepts to production-
 | [**Module 8**](modules/module8/) | 🚀 Expert | 10-12h | **Domain Applications** | ML, Scientific Computing | 4 |
 | [**Module 9**](modules/module9/) | 🚀 Expert | 6-8h | **Production Deployment** | Libraries, Integration, Scaling | 4 |
 
-**📈 Progressive Learning Path: 70+ Examples • 50+ Hours • Beginner to Expert**
+**📈 Progressive Learning Path: 71 Examples • 50+ Hours • Beginner to Expert**
 
 ### Learning Progression
 
@@ -313,7 +314,7 @@ Module 5: Performance Tuning
 ### Software Requirements
 
 #### Operating System Support
-- **Linux** (Recommended): Ubuntu 22.04 LTS, RHEL 8/9, SLES 15 SP5
+- **Linux** (Recommended): Ubuntu 22.04/24.04 LTS, RHEL 8/9, SLES 15 SP5
 - **Windows**: Windows 10/11 with WSL2 recommended for optimal compatibility
 - **macOS**: macOS 12+ (Metal Performance Shaders for basic GPU compute)
 
@@ -322,7 +323,7 @@ Module 5: Performance Tuning
   - **Driver Requirements**: 
     - Linux: 550.54.14+ for CUDA 12.4+
     - Windows: 551.61+ for CUDA 12.4+
-- **ROCm Platform**: 6.0+ (Docker uses ROCm 6.4.3)
+- **ROCm Platform**: 7.0+ (Docker uses ROCm 7.0)
   - **Driver Requirements**: Latest AMDGPU-PRO or open-source AMDGPU drivers
   - **Kernel Support**: Linux kernel 5.4+ recommended
 
@@ -338,6 +339,8 @@ Module 5: Performance Tuning
 - **Profiling**: Nsight Compute, Nsight Systems (NVIDIA), rocprof (AMD)
 - **Debugging**: cuda-gdb, rocgdb, compute-sanitizer
 - **Libraries**: cuBLAS, cuFFT, rocBLAS, rocFFT (for advanced modules)
+- **ML Libraries**: Thrust (NVIDIA), MIOpen (AMD) for deep learning applications
+- **System Management**: NVML (NVIDIA), ROCm SMI (AMD) for hardware monitoring
 
 ### Performance Expectations by Hardware Tier
 
@@ -381,28 +384,42 @@ Experience the full development environment with zero setup:
 - 📦 Isolated and reproducible builds
 - 🧹 Easy cleanup when done
 
+**Container Specifications:**
+- **CUDA**: NVIDIA CUDA 12.9.1 on Ubuntu 22.04
+- **ROCm**: AMD ROCm 7.0 on Ubuntu 24.04 
+- **Libraries**: Production-ready toolchains with debugging support
+
 **[📖 Complete Docker Guide →](docker/README.md)**
 
 ## 🔧 Build System
 
+Our advanced build system features automatic GPU vendor detection and optimized configurations:
+
 ### Project-Wide Commands
 ```bash
-make all           # Build all modules
+make all           # Build all modules with auto-detection
 make test          # Run comprehensive tests  
 make clean         # Clean all artifacts
-make check-system  # Verify GPU setup
+make check-system  # Verify GPU setup and dependencies
 make status        # Show module completion status
 ```
 
 ### Module-Specific Commands
 ```bash
 cd modules/module1/examples
-make               # Build all examples in module
+make               # Build all examples with vendor auto-detection
 make test          # Run module tests
 make profile       # Performance profiling
 make debug         # Debug builds with extra checks
 ```
 
+### Advanced Build Features
+- **Automatic GPU Detection**: Detects NVIDIA/AMD hardware and builds accordingly
+- **Production Optimization**: `-O3`, fast math, architecture-specific optimizations
+- **Debug Support**: Full debugging symbols and validation checks
+- **Library Management**: Automatic detection of optional dependencies (NVML, MIOpen)
+- **Cross-Platform**: Single Makefile supports both CUDA and HIP builds
+
 ##  Performance Expectations
 
 | Module Level | Typical GPU Speedup | Memory Efficiency | Code Quality |
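
The Makefile logic behind "Automatic GPU Detection" is not shown in this diff. As an illustration only, one common approach is to probe for the vendor compilers on `PATH`. In the sketch below, `detect_gpu_vendor` is a hypothetical function, and preferring `nvcc` when both compilers are present is an arbitrary assumption rather than the project's documented behavior.

```shell
#!/bin/sh
# Hypothetical sketch: pick a build target from whichever vendor compiler
# is found on PATH. This checks for the toolchain, not for the hardware.
detect_gpu_vendor() {
    if command -v nvcc >/dev/null 2>&1; then
        echo "nvidia"
    elif command -v hipcc >/dev/null 2>&1; then
        echo "amd"
    else
        echo "none"
    fi
}

vendor=$(detect_gpu_vendor)
echo "Detected GPU toolchain: $vendor"
```

Probing for the compiler rather than the device keeps the check usable inside image builds, where no GPU is attached.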
 
@@ -5,9 +5,9 @@ This directory contains Docker configurations for comprehensive GPU programming
 ## 🚀 Latest Versions (2025)
 
 - **CUDA**: 12.9.1 (Latest stable release)
-- **ROCm**: 6.4.3 (Latest stable release) 
+- **ROCm**: 7.0 (Latest stable release) 
 - **Ubuntu**: 22.04 LTS
-- **Nsight Tools**: 2025.1.1 (with fallback to 2024.6.1)
+- **Nsight Tools**: 2025.1.1
 
 ## 🚀 Quick Start
 
@@ -58,10 +58,10 @@ docker/
 
 ### CUDA Development Container
 **Image**: `gpu-programming-101:cuda`  
-**Base**: `nvidia/cuda:12.4-devel-ubuntu22.04`
+**Base**: `nvidia/cuda:12.9.1-devel-ubuntu22.04`
 
 **Features**:
-- CUDA 12.4 with development tools
+- CUDA 12.9.1 with development tools
 - NVIDIA Nsight Systems & Compute profilers
 - Python 3 with scientific libraries
 - GPU monitoring and debugging tools
@@ -73,17 +73,17 @@ docker/
 
 ### ROCm Development Container
 **Image**: `gpu-programming-101:rocm`  
-**Base**: `rocm/dev-ubuntu-22.04:6.0`
+**Base**: `rocm/dev-ubuntu-22.04:7.0-complete`
 
 **Features**:
-- ROCm 6.0 with HIP development environment
+- ROCm 7.0 with HIP development environment
 - Cross-platform GPU programming (AMD/NVIDIA)
 - ROCm profiling tools (rocprof, roctracer)
 - Python 3 with scientific libraries
 
 **GPU Requirements**:
 - AMD GPU with ROCm support (RX 580+, MI series)
-- AMD drivers with ROCm 6.0+
+- AMD drivers with ROCm 7.0+
 
 ## 🔧 Container Usage
 
@@ -251,7 +251,7 @@ NVIDIA_VISIBLE_DEVICES=all
 ROCM_PATH=/opt/rocm
 HIP_PATH=/opt/rocm/hip
 HIP_PLATFORM=amd
-HSA_OVERRIDE_GFX_VERSION=10.3.0
+HSA_OVERRIDE_GFX_VERSION=11.0.0
 ```
 
 ## 🛡️ Security Considerations
@@ -282,10 +282,10 @@ nvidia-smi  # For NVIDIA
 rocm-smi   # For AMD
 
 # Verify Docker GPU support
-docker run --rm --gpus all nvidia/cuda:12.4-base nvidia-smi
+docker run --rm --gpus all nvidia/cuda:12.9.1-base nvidia-smi
 
 # Check container runtime
-docker run --rm --device=/dev/kfd rocm/dev-ubuntu-22.04 rocminfo
+docker run --rm --device=/dev/kfd rocm/dev-ubuntu-22.04:7.0 rocminfo
 ```
 
 **"Container build fails"**
@@ -297,8 +297,8 @@ docker system prune -a
 sudo apt update && sudo apt upgrade docker-ce docker-compose
 
 # Check base image availability
-docker pull nvidia/cuda:12.4-devel-ubuntu22.04
-docker pull rocm/dev-ubuntu-22.04:6.0
+docker pull nvidia/cuda:12.9.1-devel-ubuntu22.04
+docker pull rocm/dev-ubuntu-22.04:7.0-complete
 ```
 
 **"Permission denied errors"**
 
@@ -1,6 +1,6 @@
 # GPU Programming 101 - Docker Compose Configuration
 # Supports both NVIDIA CUDA and AMD ROCm platforms
-# Updated for CUDA 12.9.1 and ROCm 6.4.3 (2025)
+# Updated for CUDA 12.9.1 and ROCm 7.0 (2025)
 
 services:
   # NVIDIA CUDA Development Environment
@@ -83,7 +83,7 @@ services:
     environment:
       - HIP_VISIBLE_DEVICES=0
       - HSA_OVERRIDE_GFX_VERSION=11.0.0
-      - ROCM_VERSION=6.4.3
+      - ROCM_VERSION=7.0
 
   # Development tools container (CPU-only for general development)
   dev-tools:
 
@@ -1,73 +1,14 @@
 # GPU Programming 101 - ROCm Development Container
-# Based on AMD's official ROCm 6.4.3 development image (latest stable as of 2025)
+# Based on AMD's official ROCm development image - used as-is for maximum compatibility
 
-FROM rocm/dev-ubuntu-22.04:6.4.3
+FROM rocm/dev-ubuntu-24.04:7.0-complete
 
 # Metadata
 LABEL maintainer="GPU Programming 101"
 LABEL description="ROCm/HIP development environment for GPU programming course"
 LABEL version="2.0"
-LABEL rocm.version="6.4.3"
-LABEL ubuntu.version="22.04"
-
-# Avoid interactive prompts during package installation
-ARG DEBIAN_FRONTEND=noninteractive
-
-# Install essential development tools for GPU programming
-RUN apt-get update && apt-get install -y \
-    # Core development tools
-    build-essential \
-    cmake \
-    git \
-    wget \
-    curl \
-    vim \
-    nano \
-    htop \
-    tree \
-    # Minimal Python for basic scripting (not data science)
-    python3 \
-    python3-pip \
-    python3-dev \
-    # Additional utilities
-    pkg-config \
-    software-properties-common \
-    # Debugging and profiling tools
-    gdb \
-    valgrind \
-    strace \
-    # Network tools
-    net-tools \
-    iputils-ping \
-    && rm -rf /var/lib/apt/lists/*
-
-# Install core ROCm development packages (keep minimal)
-RUN apt-get update && apt-get install -y \
-    # Core ROCm packages for GPU programming
-    hip-dev \
-    hip-samples \
-    hipblas-dev \
-    # ROCm profiling tools (essential for performance work)
-    rocprofiler-dev \
-    roctracer-dev \
-    && rm -rf /var/lib/apt/lists/*
-
-# Install minimal Python packages for basic development (no heavy data science libs)
-RUN pip3 install --no-cache-dir \
-    numpy \
-    matplotlib
-
-# Set up ROCm environment variables
-ENV ROCM_PATH=/opt/rocm
-ENV HIP_PATH=/opt/rocm/hip
-ENV PATH=${ROCM_PATH}/bin:${HIP_PATH}/bin:${PATH}
-ENV LD_LIBRARY_PATH=${ROCM_PATH}/lib:${HIP_PATH}/lib:${LD_LIBRARY_PATH}
-ENV HIP_PLATFORM=amd
-ENV HSA_OVERRIDE_GFX_VERSION=11.0.0
-ENV ROCM_VERSION=6.4.3
-
-# Verify HIP compiler installation (skip rocminfo as no GPU during build)
-RUN hipcc --version
+LABEL rocm.version="latest"
+LABEL ubuntu.version="24.04"
 
 # Create development workspace
 WORKDIR /workspace
@@ -76,7 +17,7 @@ RUN mkdir -p /workspace/{projects,samples,output}
 # Copy course materials (will be mounted as volume in practice)
 COPY . /workspace/gpu-programming-101/
 
-# Set up convenient aliases and environment
+# Set up convenient aliases and environment for the course
 RUN echo 'alias ll="ls -alF"' >> /root/.bashrc && \
     echo 'alias la="ls -A"' >> /root/.bashrc && \
     echo 'alias l="ls -CF"' >> /root/.bashrc && \
@@ -159,17 +100,5 @@ echo "=== All tests completed ==="\n' > /workspace/test-gpu.sh
 
 RUN chmod +x /workspace/test-gpu.sh
 
-# Install HIP samples for learning and reference
-RUN cd /workspace && \
-    if [ -d "/opt/rocm/hip/samples" ]; then \
-        cp -r /opt/rocm/hip/samples ./hip-samples; \
-    else \
-        git clone https://github.com/ROCm-Developer-Tools/HIP-Examples.git hip-examples; \
-    fi
-
 # Default command
-CMD ["/bin/bash"]
-
-# Health check to verify HIP compiler access (will only work when GPU is available)
-HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
-    CMD hipcc --version > /dev/null 2>&1 || exit 1
+CMD ["/bin/bash"]
@@ -212,7 +212,7 @@ main() {
     if [ "$pull" = true ]; then
         log "Pulling base images..."
         docker pull nvidia/cuda:12.4-devel-ubuntu22.04 || warning "Failed to pull CUDA base image"
-        docker pull rocm/dev-ubuntu-22.04:6.0 || warning "Failed to pull ROCm base image"
+        docker pull rocm/dev-ubuntu-24.04:latest || warning "Failed to pull ROCm base image"
     fi
 
     local success_count=0
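
The Dockerfile in this change pins `rocm/dev-ubuntu-24.04:7.0-complete`, while the pull step in this hunk requests `rocm/dev-ubuntu-24.04:latest`. One way to keep the two in step is to read the image name from the Dockerfile instead of hard-coding it. This is a minimal sketch, assuming a single-stage Dockerfile whose `FROM` line carries no `--platform` flag; `base_image_of` is a hypothetical helper and not part of `build.sh`.

```shell
#!/bin/sh
# Hypothetical helper: print the image named on the first FROM line.
base_image_of() {
    awk 'toupper($1) == "FROM" { print $2; exit }' "$1"
}

# Demonstration with a throwaway Dockerfile; in the repository the argument
# would be a path such as docker/rocm/Dockerfile.
demo=$(mktemp)
printf '# comment\nFROM rocm/dev-ubuntu-24.04:7.0-complete\nWORKDIR /workspace\n' > "$demo"
image=$(base_image_of "$demo")
echo "Would run: docker pull $image"
rm -f "$demo"
```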
 
@@ -221,7 +221,7 @@ run_rocm() {
     # Set up GPU access for AMD
     local detected_gpu=$(detect_gpu)
     if [ "$detected_gpu" = "amd" ] && [ "$no_gpu_requested" = false ]; then
-        gpu_args="--device=/dev/kfd --device=/dev/dri --security-opt seccomp=unconfined"
+        gpu_args="--device=/dev/kfd --device=/dev/dri --security-opt seccomp=unconfined --group-add video"
         log "Enabling AMD GPU access"
     elif [ "$no_gpu_requested" = true ]; then
         log "GPU access explicitly disabled with --no-gpu"
@@ -247,8 +247,6 @@ run_rocm() {
         -v "$PROJECT_ROOT:/workspace/gpu-programming-101:rw"
         -v "gpu101-rocm-home:/root"
         -w "/workspace/gpu-programming-101"
-        -e HIP_VISIBLE_DEVICES=0
-        -e HSA_OVERRIDE_GFX_VERSION=10.3.0
     )
 
     # Add port mapping
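
The AMD branch of `run_rocm` passes `/dev/kfd` and `/dev/dri` to the container and, with this change, adds the `video` group. As a hedged sketch, the same arguments can be emitted only when both device nodes exist on the host. `amd_gpu_args` and its optional root-prefix parameter are hypothetical; the prefix exists only so the function can be exercised against a temporary directory.

```shell
#!/bin/sh
# Hypothetical sketch: print the docker run arguments for AMD GPU access,
# or nothing when the device nodes are missing under the given root prefix.
amd_gpu_args() {
    root=${1:-}
    if [ -e "$root/dev/kfd" ] && [ -e "$root/dev/dri" ]; then
        echo "--device=/dev/kfd --device=/dev/dri --security-opt seccomp=unconfined --group-add video"
    fi
}

gpu_args=$(amd_gpu_args)
if [ -n "$gpu_args" ]; then
    echo "AMD GPU access enabled: $gpu_args"
else
    echo "No AMD device nodes found; running without GPU access"
fi
```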