On-Prem Cloud Engineer

Posted on May 9, 2026

  • Charlotte, United States of America
  • No Salary information.
  • Contract

Job Duties

· Build, configure, and operate on‑prem Kubernetes/OpenShift AI platforms for deploying and serving GenAI models and LLM inference workloads.

· Design and optimize high‑performance inference stacks using vLLM, TensorRT‑LLM, Triton Inference Server, SGLang, and advanced techniques (continuous batching, speculative decoding, KV caching).

· Manage GPU orchestration and capacity using Run:AI, MIG, CUDA/NCCL, and tensor parallelism to maximize utilization and throughput.

· Deploy and operate Kubernetes ML serving frameworks (KServe, Helm, Operators) for scalable, reliable model serving.

· Drive inference optimization and benchmarking, leveraging FP8, AWQ, GPTQ, and performance tools such as GuideLLM and Locust.

· Implement observability and ML monitoring using Prometheus, Grafana, and Arize AI to ensure SLA/SLO compliance for GenAI services.

· Collaborate with ML and research teams to onboard new models, tune inference performance, and productionize

Tech Skills Needed

vLLM · TensorRT‑LLM · Triton Inference Server · SGLang · Inference Optimization · Continuous Batching · Speculative Decoding · KV Cache / Prefix Caching · FP8 / AWQ / GPTQ · Tensor Parallelism · Kubernetes ML Serving · KServe · OpenShift AI · Helm / Operators · GPU Orchestration · Run:AI · Performance Benchmarking · CUDA / NCCL / MIG · Prometheus / Grafana · ML Observability · GuideLLM · Locust

Pay: From $45.00 per hour

Work Location: In person

