* [NVIDIA AI Cluster Runtime v0.14.0](https://github.com/NVIDIA/aicr) – Tooling for optimized, validated, reproducible GPU-accelerated Kubernetes clusters via version-locked recipes and deployment-ready bundles.

* [GoGPU v0.41.0](https://github.com/gogpu/gogpu) – Pure Go GPU computing ecosystem offering WebGPU-compatible APIs with selectable Rust or native backends and zero CGO.

* [RapidFire AI v0.16.0](https://github.com/RapidFireAI/rapidfireai) – Hyperparallelized, shard-based experiment framework for rapid LLM customization, RAG, context engineering, and fine-tuning with real-time control.

* [wgpu v0.29.0](https://github.com/gogpu/wgpu) – Pure Go WebGPU implementation providing W3C-compliant API and multiple hardware backends without Rust or CGO.

* [TypeGPU v0.11.8](https://github.com/software-mansion/TypeGPU) – TypeScript library enhancing the WebGPU API for type-safe resource management.

* [LLMKube llmkube-0.8.0](https://github.com/defilantech/LLMKube) – Kubernetes operator managing self-hosted LLM inference on NVIDIA GPUs and Apple Silicon, with autoscaling, model routing, and OpenAI-compatible API.

* [Tensor Fusion v1.64.1](https://github.com/NexusGPU/tensor-fusion) – State-of-the-art GPU virtualization and pooling solution that optimizes GPU cluster utilization.

* [GPUd v0.11.7](https://github.com/leptonai/gpud) – GPU-focused monitoring and diagnostics tool that detects GPU and fabric errors and reports critical system metrics.

* [Grove v0.1.0-alpha.9](https://github.com/ai-dynamo/grove) – Kubernetes API providing a single declarative interface to orchestrate multi-node AI inference with topology-aware placement, hierarchical gang scheduling, and autoscaling.