

Site Reliability Engineer
YONDU INC.
- Taguig, Philippines (7th Floor, Fort Bonifacio, Taguig, Metro Manila, Philippines)
- Full time
Posted 26 Nov 2025; application deadline is 24 Jan 2026
Job Description:
• Handle service monitoring and incident response, and drive technical support efficiency.
• Manage and maintain the network monitoring tools, systems, and processes that ensure
the availability, scalability, and performance of our production environments.
• Work closely with developers, DevOps, infrastructure teams, and other stakeholders
to achieve proactive incident prevention, issue resolution, and incident documentation.
Key Responsibilities:
• Ensure that all tickets are updated and handled according to set KPIs and SLAs.
• Manage monitoring, alerting, and logging tools to ensure system health and service
uptime.
• Ensure early detection, triage, and escalation of service degradation based on the defined
service level agreements.
• Trigger L2 ticket handling and on-call rotations for critical incidents.
• Perform the triage, diagnosis, and resolution of incidents required for L3 escalations to both
internal and third-party support teams.
• Support major incident response, contribute to root cause analysis (RCA), and help
document postmortems.
• Track, analyze, and act on incident trends and recurring technical issues.
• Use data from ticketing systems (Jira, ServiceNow, etc.) to improve team responsiveness
and resolution quality.
• Update and maintain SOPs, runbooks, and knowledge base articles, including
documentation of known issues, fixes, and playbooks, to improve mean time to resolution.
• Collaborate with development and QA teams to improve deployment readiness and
reliability.
• Participate in technical competency mapping to ensure coverage and reduce unnecessary escalations.
Minimum Qualifications
Qualifications and Experience:
• Bachelor's degree in Electronics Engineering, Information Technology, Computer
Science, Management Information Systems, or equivalent.
• 3–5 years of experience in Site Reliability Engineering, DevOps, or Infrastructure roles (a minimum of 3 years is required).
• Hands-on experience with monitoring tools (e.g., Prometheus, Grafana, ELK, or Datadog).
• Familiarity with incident response and troubleshooting in production systems.
• Experience with at least one cloud platform (AWS, GCP, or Azure).
• Knowledge of scripting (e.g., Python, Bash) and Linux systems.
• Exposure to ITIL-based processes, especially Incident and Problem Management.
• Experience working in fintech, banking, or SaaS environments with high-availability SLAs.
• Familiarity with DevOps practices, CI/CD pipelines, and cloud-based monitoring tools.
• Experience with automation platforms.
• Knowledge of BSP (Bangko Sentral ng Pilipinas) regulatory frameworks, policies, and guidelines.
Jobs Summary
- Job Level: Entry Level / Junior, Apprentice
- Job Category: IT and Software
- Educational Requirement: Bachelor's degree graduate
- Office Address: Panorama Tower 34th Street, Taguig, 1634 Metro Manila