A highly optimized vegetation pipeline for Unity 6 URP, capable of rendering 1,000,000+ instances at 0.1 ms delta.
Inspired by the technical direction of Ghost of Tsushima, this project leverages Indirect GPU Instancing, Compute Shaders, Hierarchical Occlusion Culling, and Bezier-based Procedural Deformation to achieve cinematic visuals with zero CPU overhead.
Demo videos: Custom.Mesh.MP4 (custom mesh) and Auto.Generated.Mesh.MP4 (auto-generated mesh).
Zero CPU overhead. Everything runs on the GPU.
- Compute Shader Generation (`CSMain` kernel): Generates grass blade data (position, height, rotation, wind phase) entirely on the GPU using grid-based distribution with noise-driven organic variation
- StructuredBuffer Storage: All grass data lives in GPU memory (32 bytes per blade × 1M blades = 32 MB)
- DrawMeshInstancedIndirect: Single indirect draw call per LOD level; no CPU batching required
- Procedural Mesh Construction: Runtime blade generation with configurable segment count (LOD0: 5 segs, LOD1: 3 segs, LOD2: 1 seg)
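The storage arithmetic above can be checked with a small sketch in plain C# with no Unity dependency. The struct layout is an assumption: the source names position, height, rotation, and wind phase, which add up to 24 bytes, so the two trailing floats are hypothetical padding to reach the stated 32 bytes. `GrassBladeSketch`, `GrassBufferMath`, and the thread-group size passed in are illustrative names and values, not the project's own.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical blade layout. Only position, height, rotation and wind phase
// come from the README; Pad0/Pad1 are assumed padding to reach 32 bytes.
[StructLayout(LayoutKind.Sequential)]
public struct GrassBladeSketch
{
    public float PosX, PosY, PosZ; // root position (12 bytes)
    public float Height;           // blade height (4 bytes)
    public float Rotation;         // facing angle (4 bytes)
    public float WindPhase;        // per-blade wind phase (4 bytes)
    public float Pad0, Pad1;       // assumed padding (8 bytes)
}

public static class GrassBufferMath
{
    // Size of one blade in bytes, i.e. the stride a structured buffer would use.
    public static int Stride => Marshal.SizeOf<GrassBladeSketch>();

    // Total buffer size in bytes for a given blade count.
    public static long BufferBytes(int bladeCount) => (long)bladeCount * Stride;

    // Smallest group count such that groups * threadsPerGroup >= bladeCount.
    public static int ThreadGroups(int bladeCount, int threadsPerGroup) =>
        (bladeCount + threadsPerGroup - 1) / threadsPerGroup;
}
```

With this layout, `GrassBufferMath.BufferBytes(1_000_000)` gives 32,000,000 bytes, which is the 32 MB figure quoted above (decimal megabytes). Rounding the group count up means the kernel needs a bounds check so the extra threads in the last group do nothing.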
Technical Implementation:

    // C# side: Create buffers and dispatch compute
    sourceGrassBuffer = new ComputeBuffer(grassCount, GrassData.Size);
    computeShader.Dispatch(kernelIndex, threadGroups, 1, 1);
    Graphics.DrawMeshInstancedIndirect(grassMesh, 0, material, bounds, argsBuffer);

Implements GPU-driven occlusion culling using a depth pyramid (HiZ), the same technique used in Assassin's Creed and Horizon Zero Dawn:
Pipeline:
- Depth Copy (`BlitDepth` kernel): Copies the camera's depth buffer to the HiZ texture (Mip 0)
- Mip Reduction (`ReduceDepth` kernel): Iteratively downsamples to 1×1, storing the farthest depth per 2×2 tile
- Per-Blade Test (`CSCull` kernel):
  - Projects the grass blade's bounding sphere to screen space
  - Calculates the optimal mip level based on projected size
  - Samples HiZ at the appropriate level
  - Compares blade depth vs. scene depth
  - Culls if occluded (behind terrain/objects)
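The steps above can be sketched as a CPU reference in plain C#. This is a minimal illustration rather than the project's kernels: it assumes a square power-of-two depth image and a conventional depth range where larger values are farther away (Unity uses reversed-Z on many platforms, in which case the comparisons flip). `HiZSketch` and its method names are hypothetical.

```csharp
using System;

public static class HiZSketch
{
    // Builds the next mip by keeping the farthest depth of each 2x2 tile.
    // Assumes larger depth = farther and a square power-of-two source.
    public static float[,] ReduceMip(float[,] src)
    {
        int size = src.GetLength(0) / 2;
        var dst = new float[size, size];
        for (int y = 0; y < size; y++)
        {
            for (int x = 0; x < size; x++)
            {
                float top = Math.Max(src[2 * y, 2 * x], src[2 * y, 2 * x + 1]);
                float bottom = Math.Max(src[2 * y + 1, 2 * x], src[2 * y + 1, 2 * x + 1]);
                dst[y, x] = Math.Max(top, bottom);
            }
        }
        return dst;
    }

    // Picks the first mip whose texel size (2^mip pixels) covers the
    // projected bounds, clamped to the last available mip.
    public static int SelectMip(float projectedSizePixels, int mipCount)
    {
        int mip = 0;
        while ((1 << mip) < projectedSizePixels && mip < mipCount - 1)
        {
            mip++;
        }
        return mip;
    }

    // The blade is hidden when even its nearest point lies behind the
    // farthest occluder depth stored for that screen region.
    public static bool IsOccluded(float bladeNearestDepth, float hiZDepth) =>
        bladeNearestDepth > hiZDepth;
}
```

Storing the farthest depth per tile keeps the test conservative: a blade is only rejected when every sample in the covered region is closer to the camera than the blade.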
Every frame, each grass blade goes through a GPU-side culling gauntlet:
Stage 1: Density Map Filtering (CSMain)
Stage 2: Frustum Culling (CSCull)
Stage 3: Distance-Based Density Scaling (CSCull)
Stage 4: HiZ Occlusion Test (CSCull)
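Stages 2 and 3 can be illustrated with a short CPU-side sketch in plain C# using `System.Numerics`. It is written under stated assumptions, not taken from the project's shader: frustum planes are given as (normal, d) with inward-pointing normals, density falls off linearly between two hypothetical distances, and each blade carries a stable random value in [0, 1]. All names (`CullSketch`, `fadeStart`, `fadeEnd`, `minDensity`) are illustrative.

```csharp
using System;
using System.Numerics;

public static class CullSketch
{
    // Sphere-vs-frustum test. Each plane is (nx, ny, nz, d) with the normal
    // pointing into the frustum, so inside points satisfy dot(n, p) + d >= 0.
    public static bool SphereInFrustum(Vector4[] planes, Vector3 center, float radius)
    {
        foreach (Vector4 p in planes)
        {
            float signedDistance = p.X * center.X + p.Y * center.Y + p.Z * center.Z + p.W;
            if (signedDistance < -radius)
            {
                return false; // entirely outside this plane
            }
        }
        return true;
    }

    // Fraction of blades to keep: 1 up to fadeStart, falling linearly to
    // minDensity at fadeEnd (assumed falloff shape).
    public static float DensityAt(float distance, float fadeStart, float fadeEnd, float minDensity)
    {
        float t = Math.Clamp((distance - fadeStart) / (fadeEnd - fadeStart), 0f, 1f);
        return 1f + (minDensity - 1f) * t;
    }

    // A blade survives when its per-blade random value is below the density.
    public static bool KeepBlade(float bladeRandom01, float density) =>
        bladeRandom01 < density;
}
```

Comparing a fixed per-blade value against the density, rather than drawing a new random number each frame, keeps the same blades visible from frame to frame and avoids flicker as the camera moves.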
Bezier Curve Bending:

    p0 = rootPosition;                      // Fixed anchor
    p1 = rootPosition + stiffnessOffset;    // Lower curve control
    p2 = p1 + windDirection * bendFactor;   // Upper curve control
    p3 = tipPosition + windDisplacement;    // Final tip position

Multi-Frequency Wind Layers:
- Macro Gusts (10-20 m waves):
  - Scrolling Simplex noise texture (`_WindMap`)
  - Coherent across large areas (field-wide waves)
  - `windUV = worldPos.xz * _WindFrequency + time * _WindVelocity`
- Micro Flutter (per-blade):
  - High-frequency sine wave: `sin(time * 15.0 + grass.windPhase * 10.0)`
  - Unique `windPhase` per blade for variety
  - Simulates individual blade oscillation
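To make the bending and wind formulas concrete, here is a small CPU-side sketch in plain C# using `System.Numerics`. It evaluates a standard cubic Bezier through the four control points listed under Bezier Curve Bending and reproduces the two wind expressions. Treating `_WindFrequency` as a scalar and `_WindVelocity` as a 2D vector is an assumption, and `BladeBendSketch` is a hypothetical name.

```csharp
using System;
using System.Numerics;

public static class BladeBendSketch
{
    // Cubic Bezier: B(t) = u^3 p0 + 3 u^2 t p1 + 3 u t^2 p2 + t^3 p3, with u = 1 - t.
    // t = 0 is the blade root, t = 1 is the tip.
    public static Vector3 Bezier(Vector3 p0, Vector3 p1, Vector3 p2, Vector3 p3, float t)
    {
        float u = 1f - t;
        return (u * u * u) * p0
             + (3f * u * u * t) * p1
             + (3f * u * t * t) * p2
             + (t * t * t) * p3;
    }

    // Micro flutter: the per-blade sine term, offset by the blade's wind phase.
    public static float Flutter(float time, float windPhase) =>
        MathF.Sin(time * 15.0f + windPhase * 10.0f);

    // Macro gust lookup coordinate: world XZ scaled by frequency, scrolled over time.
    // Assumes a scalar frequency and a 2D scroll velocity.
    public static Vector2 WindUV(Vector3 worldPos, float frequency, Vector2 velocity, float time) =>
        new Vector2(worldPos.X, worldPos.Z) * frequency + time * velocity;
}
```

Because `p0` is weighted by `u^3`, the root stays fixed at `t = 0` regardless of wind, while the tip at `t = 1` lands exactly on `p3`.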
| Blade Count | Draw Calls | Frame Time | FPS | Notes |
|---|---|---|---|---|
| 100,000 | 3 | 2.1ms | 144+ | Ultra-smooth |
| 500,000 | 3 | 3.8ms | 90+ | High performance |
| 1,000,000 | 3-5 | 6.2ms | 60+ | Default target |
| 2,000,000 | 5 | 11.5ms | 45 | Dense forests |
| 5,000,000 | 5 | 28ms | 30 | Extreme stress test |
Tested on an Apple M2 (8-core GPU). Performance scales with GPU compute throughput.
- Procedural Mesh: Runtime generation via C#; no FBX dependencies.
- Smart LOD: Dynamic vertex stripping (fewer triangles per blade) based on camera distance.
- Custom Mesh: Automatic override toggle for specialized foliage (wheat/flowers).
- Normal Rounding: Spherical normal interpolation for 360° light wrap on flat quads.
- Translucency (SSS): View-dependent backlighting for "Golden Hour" glow effects.
- Vertex AO: Zero-cost ambient occlusion baked into `uv.y` gradients.
- Tri-Tone Gradients: 3-point vertical interpolation (Root → Mid → Tip).
- Hash Variation: Per-instance color/dryness jittering using `Hash21(worldPos)`.
- Pipeline Ready: Full URP support with synchronized ShadowCaster & DepthOnly passes.
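The tri-tone gradient and hash variation can be sketched on the CPU in plain C#. The gradient assumes the mid color sits at half height, and the hash shown is one widely used shader-style sine hash standing in for the project's `Hash21`, whose actual body is not given in this document. `BladeColorSketch` is a hypothetical name.

```csharp
using System;
using System.Numerics;

public static class BladeColorSketch
{
    // Three-point vertical gradient: root -> mid over the lower half of the
    // blade, mid -> tip over the upper half. t is normalized height (0..1).
    public static Vector3 TriTone(Vector3 root, Vector3 mid, Vector3 tip, float t)
    {
        t = Math.Clamp(t, 0f, 1f);
        return t < 0.5f
            ? Vector3.Lerp(root, mid, t * 2f)
            : Vector3.Lerp(mid, tip, (t - 0.5f) * 2f);
    }

    // Assumed stand-in for Hash21: maps a 2D position to a repeatable
    // pseudo-random value between 0 and 1 (frac(sin(dot(p, k)) * c)).
    public static float Hash21(Vector2 p)
    {
        float s = MathF.Sin(Vector2.Dot(p, new Vector2(12.9898f, 78.233f))) * 43758.5453f;
        return s - MathF.Floor(s);
    }
}
```

Because the hash depends only on position, a blade gets the same color jitter every frame without storing an extra per-instance value.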

    GrassRenderer (MonoBehaviour)
    ├── Compute Shader Pipeline
    │   ├── CSMain Kernel → Generate grass data (position, height, rotation)
    │   ├── CSCull Kernel → Frustum/Distance/Occlusion culling + LOD sorting
    │   └── HiZ Generator → Depth pyramid construction
    │
    ├── GPU Buffers
    │   ├── sourceGrassBuffer → Master data (1M blades)
    │   ├── culledGrassBufferLOD0 → High-detail survivors (AppendStructuredBuffer)
    │   ├── culledGrassBufferLOD1 → Mid-detail survivors
    │   ├── culledGrassBufferLOD2 → Low-detail survivors
    │   └── argsBufferLOD0-2 → Indirect draw arguments

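For readers unfamiliar with indirect draw arguments, the sketch below shows the five-`uint` layout that `Graphics.DrawMeshInstancedIndirect` reads for an indexed mesh. It is plain C# without Unity types. In a Unity project this array would typically be uploaded to a `ComputeBuffer` of type `IndirectArguments`, and the instance count written on the GPU, for example with `ComputeBuffer.CopyCount` from an append buffer at the byte offset shown. Whether this project fills its `argsBufferLOD0-2` exactly this way is an assumption, and `IndirectArgsSketch` is a hypothetical name.

```csharp
using System;

public static class IndirectArgsSketch
{
    // Byte offset of the instance count (second uint) inside the args buffer.
    public const int InstanceCountOffsetBytes = sizeof(uint);

    // Builds the five-uint argument block for one LOD mesh.
    public static uint[] BuildArgs(uint indexCountPerInstance)
    {
        return new uint[]
        {
            indexCountPerInstance, // [0] index count per instance
            0,                     // [1] instance count, filled in after culling
            0,                     // [2] start index location
            0,                     // [3] base vertex location
            0                      // [4] start instance location
        };
    }
}
```

Leaving slot 1 at zero on the CPU and letting the GPU overwrite it after culling is what allows the draw call to proceed without reading the survivor count back to the CPU.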
- Ensure your project is using Unity 6000.0+ and the Universal Render Pipeline (URP).
- Attach the `GrassRenderer` component to an empty GameObject.
- Assign your Main Camera and Terrain to the inspector slots.
- Tune the `WindMap` and `DensityMask` to fit your art direction.
- Primary scripts: Assets/Scripts/GrassRenderer.cs, Assets/Shaders/GrassCompute.compute, Assets/Shaders/GrassShader.shader
- Editor tooling: Assets/Scripts/Editor/GrassPainterEditor.cs, Assets/Shaders/GrassDensityOverlay.shader
- Documentation hub: Documentation/Features_Overview.md
- Deep dives: Documentation/Feature_GPU_Architecture.md · Documentation/Feature_HiZ_Occlusion.md · Documentation/Feature_LOD_and_Density.md · Documentation/Feature_Wind_and_Shading.md · Documentation/Feature_Painting_and_Tools.md · Documentation/Feature_Debugging_and_Troubleshooting.md
We are actively working towards v1.0 to make this a production-ready open-source alternative. Our primary goals are matching commercial asset performance and adding HDRP/Built-in support. View the detailed V1.0 Roadmap.
This is an open project and we welcome all contributions, whether they are performance optimizations, new render pipeline support, or documentation fixes.
To contribute:
- Check the Roadmap and Issues for priority tasks.
- Fork the repository and create a branch for your feature.
- Submit a Pull Request with a clear description and benchmark numbers if improving performance.
Current focus areas: Compute Shader optimizations, Render Pipeline support (HDRP/Built-in), and Tooling improvements.