Machine Learning Operations Engineer

Fathom.io


Date: 1 week ago
City: Dhahran
Contract type: Full time
About The Role

We are a pioneering AI/DataOps company establishing a global footprint, with a presence in Saudi Arabia, Poland, and Norway. As a pre-Series A startup, we are proudly backed by one of the world's leading corporations, underscoring our potential and the innovative spirit driving our mission. Our platform is engineered to address complex business challenges through cutting-edge AI solutions, and we are on the brink of launching a product set to revolutionize the industry.

Role Overview

As our first MLOps Engineer, you will play a critical role in shaping the infrastructure and processes for deploying, monitoring, and scaling machine learning models. You'll work closely with our data science, engineering, and DevOps teams to build a robust ML pipeline and ensure seamless model deployment and management.

Responsibilities

  • Design, build, and maintain end-to-end ML pipelines, including data processing, model training, evaluation, and deployment.
  • Automate model deployment and lifecycle management across cloud and, potentially, on-prem environments.
  • Establish CI/CD workflows for ML models, ensuring reproducibility and traceability.
  • Implement monitoring, logging, and alerting for model performance and drift detection.
  • Optimize ML training and inference workloads for cost and performance efficiency.
  • Collaborate with DevOps and engineering teams to integrate ML workloads with broader infrastructure.
  • Define and implement MLOps best practices, including experiment tracking, versioning, and governance.
  • Evaluate and recommend tools and frameworks for MLOps, considering both cloud and on-prem scenarios.

Requirements

  • 2-7+ years of experience in MLOps, DevOps, or related fields with a strong AI/ML focus.
  • Hands-on experience with cloud platforms (GCP preferred), containers (Docker), and container orchestration (Kubernetes).
  • Proficiency in AI/ML pipeline frameworks (Kubeflow, MLflow, TFX, or similar).
  • Strong knowledge of CI/CD tools (GitHub Actions, ArgoCD, or similar) for ML models.
  • Experience with monitoring AI/ML models in production.
  • Strong programming skills in Python, Bash, or Go.
  • Familiarity with model serving frameworks (TF Serving, Triton, BentoML) and distributed computing frameworks (Ray, Spark).
  • Experience in optimizing AI/ML workloads for GPUs and CPUs.
  • Excellent problem-solving skills and ability to work in a fast-paced, evolving environment.

Nice to Have

  • Experience with hybrid cloud/on-prem deployments.
  • Experience with infrastructure-as-code tools (Terraform, Pulumi).
  • Prior startup experience, or experience working in an environment with evolving ML infrastructure.

Why Join Us?

  • Opportunity to be the first MLOps hire and define the future of ML infrastructure at Fathom.
  • Work on cutting-edge AI/ML challenges with a team that values innovation and impact.

