Application SME (BCDR & DR Automation) - KSA
DeepSource Technologies
Role Overview
The Application Subject Matter Expert (SME) is responsible for ensuring application and database readiness for Business Continuity and Disaster Recovery (BCDR), with a strong focus on automation. The role drives application-level DR strategy, execution, and documentation, and acts as the key interface between application teams and the DR Automation function to ensure reliable, fully orchestrated failover and failback.
Key Responsibilities
1. Application DR Strategy & Planning
· Own the application and database DR strategy, including failover and failback planning.
· Define application-specific DR requirements, including:
o Recovery Time Objectives (RTO)
o Startup/shutdown sequencing
o Data integrity and consistency requirements
· Continuously review and optimize application-layer RTOs through automation capabilities.
2. DR Automation Integration
· Collaborate with application owners, system administrators, and DBAs to design automated failover processes across applications and databases, covering all database technologies in use.
· Ensure end-to-end automation of application and database recovery tasks, eliminating manual intervention during both planned and unplanned DR events.
· Act as the primary bridge between application teams and the DR Automation team, ensuring alignment and seamless integration.
· Review and validate automated failover/failback workflows to ensure accuracy, sequencing, and dependency handling.
· Drive development and validation of data collection templates to ensure all required DR inputs are captured and validated during workshops.
3. Dependency Management & Runbook Alignment
· Ensure accurate mapping of application, database, and infrastructure dependencies, including inter-application relationships.
· Validate that startup/shutdown sequences and dependency mapping are correctly reflected in automation workflows and runbooks.
· Regularly review Application and Database sections of DR Plans (DRPs) to ensure alignment with automation design and current environment state.
4. Technical Leadership & Workshops
· Lead and govern technical discussions and workshops across:
o Discovery
o Validation
o Tabletop exercises
· Provide expert guidance to ensure robust application DR design and automation readiness.
5. Testing, Execution & Validation
· Support and validate DR testing activities, including control tests, dry runs, and full DR simulations.
· Ensure application and database recovery meets the defined RTO, Recovery Point Objective (RPO), and performance expectations.
· Analyze test results and identify gaps, risks, and improvement opportunities.
6. Documentation Ownership & Governance
· Own the end-to-end DR documentation framework, including:
o Standards and templates
o Version control and governance
· Develop, maintain, and continuously update:
o Disaster Recovery Plans (DRPs)
o Automation workflow documentation
o Runbooks and operational guides
· Ensure documentation reflects:
o Current application architecture
o Dependency mapping
o Recovery procedures and sequencing
7. Audit, Compliance & Reporting
· Review and ensure compliance with audit and regulatory requirements for application and database DR (e.g., logging, traceability).
· Support audit readiness by ensuring accurate and up-to-date documentation and controls.
8. Cross-Functional Collaboration
· Work closely with Technical Leads, Infrastructure and Network SMEs, Automation teams, and Business Continuity stakeholders.
· Capture and document design decisions, test outcomes, lessons learned, and remediation actions.
· Ensure alignment across all technical streams for integrated DR execution.
9. Continuous Improvement & Optimization
· Identify and drive continuous improvement initiatives to enhance automation coverage, reduce recovery time, and improve reliability.
· Optimize application DR processes to increase efficiency and reduce operational risk.
10. DR Test & Event Documentation Support
· Prepare and maintain documentation packs for DR tests and live events.
· Capture test results, deviations, issues, and improvement actions in post-event reports.
· Ensure all learnings are incorporated into updated DR documentation and automation workflows.
Requirements
· Strong experience in application and database architecture, operations, and DR planning.
· Deep understanding of application dependencies, multi-tier architectures, and database technologies.
· Hands-on experience with DR automation and orchestration tools.
· Experience with RTO/RPO definition, DR testing, and failover design.
· Strong skills in documentation, stakeholder coordination, and technical leadership.
How to apply
To apply for this job, you need to log in on our website. If you don't have an account yet, please register.