Resume

Jeremy Martinez

Senior Site Reliability Engineer · Incident Commander · Platform Automation

Las Vegas, NV · (720) 310-5673 · mrhits777@gmail.com

Professional summary

Senior Site Reliability Engineer with 20+ years designing, operating, and scaling mission-critical infrastructure across cloud and hybrid environments. Proven leader in automation, observability, incident command, and reliability engineering with a consistent record of eliminating toil, improving uptime, reducing cloud spend, and strengthening production resilience. Deep experience serving as Incident Commander for high-traffic platforms and enterprise operations organizations.

Core skills

Observability, Incident Response & ITSM
Datadog, Sysdig, Prometheus, Grafana, Sumo Logic, PagerDuty, Rootly, ServiceNow, Jira
Cloud & Platform Engineering
AWS, Azure, GCP, Kubernetes, OpenShift
Infrastructure as Code & Automation
Terraform, Helm, Ansible, Jenkins, Argo CD
Reliability, Networking & Resilience
Load Balancing, Traffic Engineering, DDoS Mitigation, High Availability, Disaster Recovery
Storage & Distributed Data
MySQL, PostgreSQL, Ceph, NFS, iSCSI, Veritas VCS
Operating Systems
Linux (RHEL, CentOS, Ubuntu), Solaris
Programming & Scripting
Python, Bash, Shell, Perl, PHP, Java

Experience

Dynascale Inc.

03/2024 – Present

Senior Site Reliability Engineer · Incident Commander & Responder

  • Architect and operate highly available cloud platforms across AWS, Azure, and GCP supporting multiple client production environments.
  • Serve as senior Incident Commander for customer and platform incidents — triage, mitigation, escalation, and post-incident remediation.
  • Improved observability through custom alerting pipelines and real-time telemetry integration.
  • Reduced cloud spend through reserved instances, autoscaling optimization, and rightsizing.
  • Automated infrastructure lifecycle with Terraform, CloudFormation, and Ansible — cutting deployment lead time and operational risk.
  • Reduced manual operational intervention by 30% through automation, self-healing workflows, and standardization.
  • Lead disaster recovery strategy: backup validation, failover testing, and incident response playbooks.
  • Develop agentic AI automation pipelines for system administration and self-healing remediation across Hyper-V, AWS, and Azure.
  • Mentor engineers on reliability engineering, automation practices, and production ownership.

Upstart Inc.

07/2022 – 02/2024

Senior Site Reliability Engineer / Incident Commander

  • Served as Incident Commander for enterprise production incidents, coordinating engineering, operations, and executive stakeholders during major outages.
  • Owned Rootly configuration and operational workflows for incident lifecycle management.
  • Implemented standardized incident response processes, improving consistency and reducing MTTR.
  • Delivered weekly reliability metrics and incident analytics to executive leadership.
  • Built runbooks, playbooks, and incident simulation exercises to improve organizational readiness.
  • Led blameless postmortems and translated incident insights into durable corrective actions.

eBay Inc.

10/2011 – 07/2022

Production Unix Systems Engineer / MTS / Incident Responder

  • Recognized with a Critical Talent Bonus for high-impact contributions to incident management and operational automation.
  • Senior Incident Responder and escalation owner for large-scale production incidents impacting global e-commerce platforms.
  • Designed automation eliminating 90% of manual operational toil for a 12-person team.
  • Maintained uptime SLA of 99.997% across critical services.
  • Supported availability of a 10,000+ node Hadoop cluster.
  • Administered Veritas VCS clusters supporting high-availability Oracle environments.

New Frontier Media Inc.

08/2008 – 10/2011

Systems Engineer

  • Designed and operated high-traffic streaming platforms supporting national broadcast distribution.
  • Deployed MongoDB, Redis, Nginx, Node.js, VMware, and Xen environments.
  • Implemented clustered MySQL architectures and high-availability outbound e-mail systems.

Hit Director Domains, Comtech, ISSG, The Internet Web Hosting Company

Pre-2008

Earlier Career (Condensed)

  • Built and operated high-traffic Linux server farms and web hosting platforms.
  • Designed secure streaming systems, PCI-compliant infrastructure, and hardened production environments.
  • Led migrations, performance tuning, vulnerability remediation, and infrastructure modernization.

US Army

04/1993 – 08/1997

Communications Center Operator

  • Operated secure telecommunications systems within classified environments (TS/SCI).
  • Maintained satellite and fiber communication systems and cryptographic equipment.
  • Awarded multiple Army Achievement Medals and leadership honors.

Education

Management / Computer Information Systems
Park University, Parkville, Missouri
162 credit hours completed

Certifications

  • Incident Response Certification, PagerDuty (2024)
  • Monitoring AWS Certification, Datadog (2023)
  • Sumo Logic Fundamentals and Search Mastery (2022)
  • Cisco CCNA (expired)
  • Security+