Job Title: AI/ML Engineer – LLM Integration & Model Operations
Location: Remote - USA
Employment Type: Full-time/Permanent
"CANDIDATE WHO ARE APPLLYING SHOULD BE LEGALLY AUTHORIZED TO WORK IN THE U.S. WITHOUT SPONSORSHIP"
About the Role
We are seeking an experienced AI/ML Engineer to help design, deploy, and optimize large language model (LLM) systems in production. You’ll work on model integration, scalable AI microservices, and evaluation frameworks — ensuring performance, security, and compliance across all deployments.
This role is ideal for someone who enjoys building real-world AI applications, collaborating across teams, and innovating in a fast-paced environment.
Key Responsibilities
• Deploy and optimize LLMs, embeddings, and RAG pipelines in production.
• Build scalable Python-based AI microservices for question answering, document understanding, and reasoning.
• Implement evaluation and monitoring for model precision, recall, latency, and drift.
• Develop prompt templates, context-window strategies, and fine-tuning approaches tailored to business needs.
• Collaborate with data scientists to design auditable data pipelines from raw input to model-ready datasets.
• Ensure security and compliance (SOC2/HIPAA) in all AI deployments.
• Partner with product, compliance, and engineering teams to deliver reliable, innovative AI features.
Qualifications
Required:
• 4+ years of experience in applied AI/ML engineering, with proven production deployment experience.
• Strong skills in Python, FastAPI/Flask, and frameworks such as Transformers, LangChain, PyTorch, or TensorFlow.
• Hands-on experience with LLM APIs, embeddings, retrieval systems, or fine-tuning models.
• Experience deploying on AWS, Azure, or GCP.
• Knowledge of vector databases (Pinecone, Weaviate, FAISS).
• Ability to implement structured evaluation and monitoring systems.
• Comfortable working in fast-paced, startup-style environments.
Preferred:
• Background in insurance or financial services data.
• Experience with orchestration tools such as LangGraph, Airflow, or Prefect.
• Familiarity with secure deployment patterns (VPC, single-tenant isolation, audit logging).
Job Type: Full-time
Pay: From $120,000.00 per year
Application Question(s):
• How many years of experience do you have in applied AI/ML engineering?
• Have you deployed large language models (LLMs) or RAG pipelines into production environments?
• Describe your experience with Python and frameworks such as FastAPI or Flask.
• Which ML libraries or frameworks have you used extensively (e.g., Transformers, LangChain, PyTorch, TensorFlow)?
• Have you integrated or fine-tuned models using LLM APIs (e.g., OpenAI, Anthropic, or Hugging Face)?
• What cloud platforms have you used for deploying AI workloads (AWS, Azure, GCP)?
• Have you worked with vector databases like Pinecone, Weaviate, or FAISS?
• Can you describe how you’ve implemented evaluation or monitoring frameworks for AI model performance?
• Have you set up alerting or drift detection systems for model degradation?
• Are you familiar with SOC2, HIPAA, or similar data security and compliance standards?
• Have you worked in the insurance or financial services domain before?
Work Location: Remote