Hiring for Lead Site Reliability Engineer with Java - San Antonio, Texas - Long Term at San Antonio, Texas, USA |
Email: [email protected] |
http://bit.ly/4ey8w48
https://jobs.nvoids.com/job_details.jsp?id=2251669&uid=

From: Kevin, DCTS [email protected]
Reply to: [email protected]

Position: Lead Site Reliability Engineer with Java
Location: San Antonio, Texas
Duration: Long Term
Experience: 14+ years; USC and GC only; W2 & 1099

Job Description:
As a Lead Site Reliability Engineer (SRE), you will leverage your extensive experience in SRE practices to maintain and enhance the reliability, performance, and scalability of mission-critical systems. You will play a crucial role in ensuring the continuous availability and optimal functioning of our services.

Key Responsibilities:
Senior-Level SRE Expertise: Apply your deep understanding of SRE principles to lead efforts in improving system reliability and operational efficiency.
Incident Management: Provide expert-level support during incidents, ensuring swift resolution with minimal service disruption. Lead post-incident reviews to drive continuous improvement.
Monitoring & Alerting: Design, implement, and optimize monitoring, alerting, and incident response processes. Ensure the effectiveness of these systems to proactively address potential issues.
Automation: Drive the automation of manual processes to enhance operational efficiency, reduce human error, and increase overall system resilience.
CI/CD Pipeline Management: Develop, maintain, and improve automated CI/CD pipelines using tools such as GitLab CI/CD and Jenkins, ensuring seamless and reliable deployment processes.
Cross-Functional Collaboration: Work closely with cross-functional teams to ensure the reliability, performance, and scalability of our infrastructure. Foster a culture of collaboration and knowledge sharing.
Support Across Time Zones: Provide support across all U.S. time zones, with the flexibility to work weekends, rotational shifts, and overtime as required to maintain service continuity.

Required Skills & Qualifications:
Java Programming: Advanced proficiency in Java, with a deep understanding of contemporary software development practices.
Kubernetes & Containerization: Extensive hands-on experience with Kubernetes, including containerization technologies like Docker and Kubernetes storage solutions such as Portworx.
Linux/Unix Systems: Strong command of Linux/Unix operating systems and scripting (Bash), with a focus on system reliability and automation.
Functional Programming: Proficiency in functional programming languages such as Prolog, Haskell, and OCaml.
Scripting & Automation: Experience with Python or Go, particularly in the context of scripting and automation tasks.
Virtualization: In-depth knowledge of VMware and other virtualization platforms, with a focus on optimizing virtual environments for reliability and performance.
Streaming Technologies: Expertise with Kafka Stream Generator, ksqlDB, cluster federation, and Spark Streams, including experience in managing and optimizing streaming data architectures.
Service Mesh & Networking: Familiarity with Istio and Anthos Service Mesh, with the ability to manage and optimize service meshes for complex environments.
Performance Monitoring & Debugging: Proficiency in using eBPF (extended Berkeley Packet Filter) for performance monitoring and debugging.
Monitoring & Logging Tools: Experience with industry-standard monitoring and logging tools such as Splunk, Prometheus, Datadog, and Kiali.
Load Balancing: Familiarity with Nginx Controller and Seesaw for effective load balancing and traffic management.
Infrastructure-as-Code (IaC): Competence in using Terraform for managing cloud infrastructure, ensuring consistency and scalability across environments.

Additional Requirements:
Flexibility: Willingness to work weekends, rotational shifts, and provide 24/7 support as necessary to maintain service reliability and meet project deadlines.

Certifications Required: Kubernetes, Azure
12:54 AM 13-Mar-25 |