SB Telecom America Corp.

AI/ML Engineer, AI Infrastructure

24d ago
Location
Sunnyvale, California, US
Type
On-site · Full-time
Compensation
$150k – $250k/yr
Skills
Python · C · C++ · PyTorch · TensorFlow · JAX · Large Language Models (LLMs) · Generative AI +22
About Infrinia.ai, powered by SoftBank:
SoftBank is making significant investments in AI infrastructure. Through its wholly owned US subsidiary, SoftBank Corp. has established the Infrinia team in Silicon Valley, focused on infrastructure software for AI and AI foundations for mobile networks. Our goal is to challenge the norms and create products built on our state-of-the-art infrastructure (such as the NVIDIA GB200 NVL72, MGX, and DGX Grace Hopper platforms) and cloud-native software. These products target centralized AI data centers as well as distributed AI Radio Access Network (AI RAN) data centers. We are looking for experienced practitioners who are inspired to innovate and build transformative products.

Minimum Qualifications:
• Bachelor's degree in Computer Science, Electrical Engineering, Mathematics, Statistics, or a related field.
• 3+ years of experience in machine learning, deep learning, and software engineering.
• Proficiency in Python and experience with C/C++.
• Experience with major AI/ML frameworks such as PyTorch, TensorFlow, or JAX.
• Solid understanding of data structures, algorithms, and software design principles.

Preferred Qualifications:
• Master's or PhD in a relevant field (CS, AI/ML, etc.).
• Experience with Large Language Models (LLMs), Generative AI, or Computer Vision.
• Familiarity with distributed training techniques and tools (e.g., Ray, DeepSpeed, Megatron).
• Experience optimizing models for GPU inference (TensorRT, Triton Inference Server).
• Knowledge of MLOps practices and tools (Kubeflow, MLflow).

Role:
Be a key member of the AI engineering team responsible for developing and optimizing advanced AI models and workloads that run on our high-performance GPU systems. You will leverage our state-of-the-art infrastructure to train, fine-tune, and serve large-scale models, and drive innovation in model architecture and training efficiency to maximize performance and resource utilization.
Work closely with infrastructure engineers, product management, and researchers to bridge the gap between hardware capabilities and AI application requirements.

Responsibilities:
• Design, implement, and train state-of-the-art machine learning models for a range of applications (e.g., NLP, computer vision, network optimization).
• Optimize AI workloads for performance and scalability on large-scale GPU clusters such as the GB200 NVL72, using frameworks like Dynamo and vLLM.
• Collaborate with the team to co-design software and hardware solutions for efficient AI processing.
• Develop tools and pipelines for data processing, model evaluation, and deployment.
• Stay up to date with the latest advancements in AI research and technology.
• Contribute to product requirements definition (PRD) and sprint planning.
• Model and foster a culture of humility and innovation in product delivery.

Salary:
The base salary for this position ranges from $150,000 to $250,000, plus an attractive biannual bonus and benefits.