OptimizationJob CRD for Hyperparameter Optimization in Kubeflow Katib
Aniket Shaha
This project addresses limitations in Kubeflow Katib's current Experiment CRD, which uses a generic and loosely typed interface for hyperparameter...
Project 5: Helm Charts for Kubeflow Pipelines and Katib — Danish Siddiqui
danish9039
KFP and Katib users deploying via Helm currently have no upstream-supported path — charts exist but drift from Kustomize baselines silently, with no...
Project 6: MCP Server for Kubeflow SDK
Krishna-kg732
The Kubeflow SDK gives AI practitioners a clean Python interface to submit, monitor, and manage distributed training jobs on Kubernetes via...
Agentic RAG on Kubeflow — Multi-Index Retrieval with Kagent & MCP
Rohit Kumar (Kmrrohit)
Kubeflow's documentation, GitHub issues, and platform code are spread across dozens of repositories with no unified search. This project evolves...
Kubeflow SDK/SparkClient - Batch Jobs, Observability & Production Readiness
Sameer_Yadav
The current Kubeflow SparkClient (KEP-107) provides a solid foundation for running interactive Spark workloads on Kubernetes, but it is still missing...
Platform Scalability and Security
siddhant_jainn
Kubeflow's adoption at enterprise scale exposes critical bottlenecks in controller efficiency, security posture, and operational overhead. This...
End-to-End ARM64 Support & Validation on Kubeflow
Syed Mohd Maaz
ARM64 is becoming increasingly common with the rise of Apple Silicon and cloud instances like AWS Graviton, but Kubeflow still doesn’t run...
Project 11 - Composable Kale Notebooks with Visual Pipeline Editor
Yash_vrd
Kale compiles a single Jupyter notebook into a Kubeflow Pipeline by parsing cell tags, detecting data dependencies with PyFlakes, and generating KFP...
Dynamic LLM Trainer Framework for Kubeflow — TRL Backend with Pluggable Multi-Framework Support
Yassin Hashem
Kubeflow Trainer V2 currently supports only TorchTune as its LLM fine-tuning backend. TorchTune stopped adding new features in July 2025, leaving...