Job Description
InferenceOps AI is scaling its LLM inference infrastructure and is hiring an engineer to optimize GPU clusters and model-serving pipelines.
Key Skills
- CUDA
- vLLM
- Triton
- Kubernetes
Requirements
- 4+ years of infrastructure engineering experience
- GPU cluster management
- Experience with LLM serving frameworks
- Kubernetes at scale
Benefits
- GPU access
- Top-of-market compensation
- Research time
- Premium health coverage