AI Cloud Engineer (Full-Time, Remote US) at TRSA | Association for Linen, Uniform and Facility Services Industry

TRSA | Association for Linen, Uniform and Facility Services Industry

AI Cloud Engineer (Full-Time, Remote US)

15d ago
Location
New York, New York, US
Type
Hybrid · Full-time
Compensation
$140k – 195k/yr
Skills
AWS, GCP, Azure, Kubernetes, Helm, Terraform, Pulumi, CloudFormation, +22 more
Company: LockedIn AI
Location: Remote (US-Based) • Optional Hybrid (New York, NY)
Reports To: Co-Founder / CEO
Compensation: $140,000 – $195,000 USD per year

About LockedIn AI

LockedIn AI is a fast-growing AI-native platform trusted by over one million users worldwide. We build real-time AI tools that help candidates succeed in job interviews, coding assessments, and professional meetings. Our core product delivers live AI assistance during interviews and assessments, helping users communicate clearly, think faster, and perform at their best in high-pressure situations. We are now scaling our infrastructure to support the next generation of AI-powered real-time systems.

Role Overview

We are looking for a cloud-native, AI-infrastructure-focused AI Cloud Engineer to design and operate the cloud systems that power our machine learning and real-time AI products. This role sits at the intersection of cloud engineering, DevOps, and AI systems architecture.

You will own the infrastructure layer that supports:
  - Model training and fine-tuning pipelines
  - Real-time LLM inference systems
  - GPU-based distributed compute environments
  - High-scale production AI services for 1M+ users

You will be responsible for building highly scalable, cost-efficient, and low-latency cloud infrastructure optimized specifically for AI workloads.

Key Responsibilities

• AI Cloud Architecture
  - Design cloud-native infrastructure for AI/ML workloads
  - Build GPU-based compute environments for training and inference
  - Architect multi-stage environments (training, staging, production)
  - Optimize AWS / GCP / Azure infrastructure for AI performance and scale

• Model Serving & Inference Systems
  - Build and maintain low-latency inference pipelines for LLMs and AI services
  - Deploy model serving frameworks (vLLM, Triton, TensorRT, TGI, etc.)
  - Optimize throughput, batching, caching, and GPU utilization
  - Design failover, load balancing, and high-availability systems

• GPU Infrastructure & Distributed Training
  - Manage GPU clusters for training and fine-tuning large models
  - Implement distributed training pipelines (multi-node, multi-GPU)
  - Optimize compute scheduling, spot instances, and resource efficiency
  - Support managed AI platforms (SageMaker, Vertex AI, Azure ML)

• Cost Optimization (FinOps for AI)
  - Monitor and reduce cloud costs across compute, storage, and APIs
  - Implement GPU cost optimization strategies (spot, reserved, autoscaling)
  - Build dashboards for cost-per-inference and cost-per-training-job
  - Optimize LLM usage, caching, and routing strategies

• Security & Networking
  - Design secure VPC architectures for AI systems
  - Implement IAM policies, encryption, and secrets management
  - Ensure compliance readiness (SOC2, GDPR, CCPA)
  - Secure model weights, embeddings, and AI APIs

• Infrastructure Automation & Observability
  - Build Infrastructure as Code (Terraform / Pulumi / CloudFormation)
  - Automate deployment of training and inference environments
  - Implement monitoring for GPU health, latency, and system performance
  - Build alerting systems for failures and performance degradation

Required Qualifications

Experience
  - 3+ years in cloud engineering, DevOps, or infrastructure roles
  - Experience with ML/AI workloads in production environments
  - Hands-on GPU infrastructure or AI system deployment experience
  - Strong understanding of distributed systems and cloud architecture
  - Experience in fast-paced startup or scale-up environments

Technical Skills
  - Cloud platforms: AWS, GCP, or Azure (strong proficiency required)
  - Kubernetes (GPU scheduling, autoscaling, Helm, clusters)
  - Infrastructure as Code (Terraform / Pulumi / CloudFormation)
  - Python, Go, or Bash for automation and tooling
  - AI serving systems (vLLM, Triton, TensorRT, TGI, etc.)
  - Monitoring tools (Prometheus, Grafana, Datadog, CloudWatch)

Preferred Qualifications
  - Experience with large-scale LLM inference systems
  - Distributed training (multi-node GPU clusters, NCCL, parallelism)
  - Streaming or real-time systems (WebSockets, low-latency APIs)
  - RDMA / InfiniBand or high-performance networking
  - Experience in SaaS, EdTech, or consumer AI products
  - Open-source contributions in cloud or AI infrastructure

What We Offer
  - Equity Ownership: Meaningful early-stage equity in a fast-scaling AI company
  - High Impact: Your infrastructure directly powers 1M+ global users
  - Cutting-Edge AI Systems: Work on real-time AI products at production scale
  - Remote Flexibility: Work from anywhere in the US (optional NYC hybrid)
  - Fast Growth Environment: High autonomy, fast execution, and meaningful ownership
  - AI-Native Culture: Work with modern AI systems, not legacy infrastructure

Why Join Us

At LockedIn AI, you won’t just maintain infrastructure; you’ll build the backbone of real-time AI systems used in live interviews and professional environments worldwide. This is a rare opportunity to shape how AI infrastructure scales at consumer level.

How To Apply

Please submit:
  - Resume / CV
  - Short note explaining why you want to join LockedIn AI
  - Optional: GitHub, portfolio, or technical writing

Location Address: Manhattan, KS, USA New York 10003 United States
Job Category: Information Technology (IT)
Job Type: Full Time
Market Sector: Facility Services
Education (*Required Level): Some College
Years of Experience: 2-4 years
Level: Experienced
Telecommuting Allowed? No
Number of Openings: 2
Salary: $140,000 – $195,000 USD per year