Infrastructure Subject Matter Expert for BCDR & DR Automation - KSA
DeepSource Technologies
Role Overview
The Infrastructure SME plays a critical role in ensuring that the underlying IT infrastructure fully supports Business Continuity and Disaster Recovery (BCDR) objectives, with a strong focus on DR Automation. This role bridges infrastructure and automation teams, ensuring resilience, scalability, and seamless failover/failback execution across all infrastructure layers.
Key Responsibilities
1. Infrastructure Architecture & Readiness
· Review and validate the end-to-end infrastructure architecture supporting automated DR failover and failback, including:
o Network, security, compute, storage, virtualization, containers, and data center components
· Ensure the design supports high availability, resiliency, and recoverability aligned with business requirements.
2. DR Automation Integration
· Act as the primary bridge between infrastructure teams and the DR Automation team, ensuring alignment and seamless collaboration.
· Review and validate automated failover/failback workflows across infrastructure components, including:
o Network, security, servers, storage, DNS, virtualization platforms, and container environments
· Collaborate on the development of pre-failover validation scripts to ensure readiness before execution.
3. Recovery Objectives & Capacity Planning
· Review and validate infrastructure-level Recovery Time Objectives (RTOs), ensuring alignment with application and business recovery requirements.
· Ensure sufficient capacity and performance within DR sites and automation platforms to support:
o Full failover scenarios
o Partial or phased failover scenarios
4. Technical Leadership & Engagement
· Lead and actively participate in technical discussions and workshops across:
o Discovery
o Validation
o Tabletop exercises
· Provide domain expertise and recommendations to ensure robust infrastructure design and DR strategy alignment.
5. Performance & Validation
· Oversee and validate infrastructure performance testing during and after DR failover/failback activities.
· Ensure that systems meet defined performance benchmarks and recovery objectives post-recovery.
6. Compliance & Audit Readiness
· Review and ensure adherence to audit and regulatory requirements, particularly around:
o Logging
o Monitoring
o Traceability of DR activities
· Support audit readiness by ensuring proper documentation and controls are in place.
7. Cross-Functional Collaboration
· Collaborate with Application, Network, Security, Database, and Business teams to ensure end-to-end alignment.
· Coordinate with stakeholders to ensure dependencies are properly managed across infrastructure and application layers.
8. Continuous Improvement & Optimization
· Identify opportunities to optimize infrastructure resilience, performance, and cost efficiency.
· Drive continuous improvement initiatives based on test results, incidents, and evolving business needs.
Requirements
· Strong expertise in enterprise infrastructure design and operations (network, compute, storage, virtualization, cloud).
· Hands-on experience with Disaster Recovery architectures and DR automation tools.
· Deep understanding of failover/failback mechanisms and infrastructure dependencies.
· Experience in capacity planning, performance testing, and high availability design.
· Knowledge of regulatory and compliance requirements related to DR and infrastructure.
· Strong stakeholder communication and cross-team coordination skills.
Preferred Qualifications
· Experience in large-scale BCDR and DR Automation programs.
· Certifications in infrastructure technologies, cloud platforms, or DR/BCDR frameworks.