AI Computing Infrastructure Engineer – GPU & High-Performance Computing
NETS-International Group
Date: 2 weeks ago
City: Riyadh
Contract type: Contractor

Riyadh, Saudi Arabia
Company Description
NETS is a leading global Solutions Provider and Systems Integrator dedicated to empowering the future through our integrated approach and commitment to delivering Innovative, Intelligent, and Integrated Solutions (NETS 3 I’s) Effectively, Efficiently, and Economically (NETS 3 E’s). Our service portfolio covers three verticals, namely Infrastructure, Digital, and Managed Solutions, and NETS Services include Access Networks (Fixed and Wireless), Enterprise Data Networks, Cloud Solutions, Cyber Security, Automation, Resource Outsourcing, and Managed Services. NETS brings over four decades of proven domain expertise, service specialization, and industry leadership, having delivered more than 3,000 successful projects. Our 1,000+ highly skilled professional staff, collaboration with over 50 leading global technology partners, 100+ NETS OEM Partners, and NETS Reach, with offices in the UK, UAE, USA, Saudi Arabia, and Pakistan, have allowed us to be the preferred trusted partner to over 200 long-standing satisfied customers, including Fortune 500 companies, across 25+ countries.
Job Description
AI Computing Infrastructure Engineer – GPU & High-Performance Computing
Role Overview
We are looking for a highly capable AI Infrastructure Engineer to design, implement, and optimize GPU-accelerated compute environments that power advanced AI and machine learning workloads. This role is critical in building and supporting scalable, high-performance infrastructure across data centers and hybrid cloud platforms, enabling training, fine-tuning, and inference of modern AI models.
Key Responsibilities
- Design and deploy AI infrastructure on multi-GPU clusters using NVIDIA or AMD platforms.
- Configure GPU environments using CUDA, DGX Systems, and the NVIDIA Kubernetes Device Plugin.
- Deploy and manage containerized environments with Docker, Kubernetes, and Slurm.
- Support and optimize training, fine-tuning, and inference pipelines for LLMs and deep learning models.
- Enable distributed training using DDP, FSDP, and ZeRO, with support for mixed precision.
- Tune infrastructure to optimize model performance, throughput, and GPU utilization.
- Design and operate high-bandwidth, low-latency networks using InfiniBand and RoCE v2.
- Integrate GPUDirect Storage and optimize data flow across Lustre, BeeGFS, and Ceph/S3.
- Support fast data ingestion, ETL pipelines, and large-scale data staging.
- Leverage NVIDIA’s AI stack including cuDNN, NCCL, TensorRT, and Triton Inference Server.
Requirements
Required Skills & Qualifications
- Bachelor’s or Master’s degree in Computer Science, Engineering, or related field.
- 3–6 years of experience in AI/ML infrastructure engineering or high-performance computing (HPC).
- Solid experience with GPU-based systems, container orchestration, and AI/ML frameworks.
- Familiarity with distributed systems, performance tuning, and large-scale deployments.
- Expertise in modern GPU architectures (e.g., NVIDIA A100/H100, AMD MI300), multi-GPU configurations (NVLink, PCIe, HBM), and accelerator scheduling for AI training and inference workloads.
- Good understanding of modern AI model architectures, including LLMs (e.g., GPT, LLaMA), diffusion models, and multimodal encoder-decoder frameworks, with awareness of their compute and scaling requirements.
- Knowledge of leading AI/ML frameworks (e.g., TensorFlow, PyTorch), NVIDIA’s AI stack (CUDA, cuDNN, TensorRT), and open-source tools like Hugging Face, ONNX, and MLPerf for model development and benchmarking.
- Familiarity with AI pipelines for supervised/unsupervised training, fine-tuning (PEFT/LoRA/QLoRA), and batch or real-time inference, with expertise in distributed training, checkpointing, gradient strategies, and mixed precision optimization.
- NVIDIA Certified Professional – Data Center AI
- Kubernetes Administrator (CKA)
- CCNP or CCIE Data Center
- Cloud Certification (AWS, Azure, or GCP)