Systems Engineer Specialist
ZA
Job Purpose
To provide advanced operational and engineering support for the ELP platform, with a strong focus on Kafka technologies. The role bridges day-to-day platform operations with engineering and architecture work, ensuring the stability, scalability, and continuous improvement of the bank’s event-driven and logging platforms. The incumbent will support mission-critical platforms used across multiple applications in the bank, ensuring optimal performance.
Key Responsibilities
- Operational Support & Platform Stability
- Perform daily platform health checks using monitoring dashboards and tools.
- Respond to platform alerts, investigate abnormalities, perform deep-dive analysis, and communicate findings.
- Participate in month-end and critical monitoring activities.
- Manage and action business support requests (certificates, access, password resets, troubleshooting).
- Collaborate with other teams during patching cycles, restart pipelines, and validate environment stability.
- Provide operational support for logging and event-driven platforms.
- Engineering & Technical Delivery
- Support the setup and engineering of Kafka topics and configurations for application teams.
- Assist with onboarding requests for Kafka and Elastic.
- Contribute to platform enhancements as the team transitions to cloud-based event-driven architectures.
- Collaborate with vendors to resolve complex technical issues.
- Standby & Incident Management
- Participate in the 24/7 standby rotation once familiar with the environment.
- Serve as the initial point of escalation for critical incidents.
- Work with cross-functional teams to restore services and implement preventative measures.
- Monitoring, Observability & Automation
- Use monitoring tools to identify potential performance issues early.
- Drive automation using Ansible or Terraform.
- Enhance platform observability using Dynatrace, Prometheus, Grafana, or Datadog.
Required Skills & Technologies
- Apache Kafka experience - Required
- Elastic / ELK Stack (Elasticsearch, Logstash, Kibana) - Advantage
- Monitoring tools: Dynatrace, Prometheus, Grafana, Datadog
- Automation and IaC: Ansible, Terraform
- Container & cloud: AKS, OpenShift
- Linux proficiency
Minimum Experience Level
- 3–5 years in a Systems Engineering role.
- Experience with Kafka.
- Broader IT experience acceptable if paired with Kafka expertise.