Required SRE - Onsite 3x per week in Plano, TX [No H1B] [Rate on W2-1099 only] at Plano, Texas, USA |
Email: [email protected] |
http://bit.ly/4ey8w48
https://jobs.nvoids.com/job_details.jsp?id=1918323&uid=

From: Bhavya Nikhil, KPG99 Inc. [email protected]
Reply to: [email protected]

Hi,

Hope you are doing well. Please find the job description below and let me know your interest.

Position : SRE
Location : Onsite 3x per week in Plano, TX
Duration : 6+ months
Mode of interview : Video

Job description :
- Monitor alerts coming into ServiceNow. The candidate should be able to work independently without much guidance.
- Should be able to do some root cause analysis on issues and take a proactive approach to solving problems.
- The previous SRE team worked a queue in ServiceNow (SNOW).
- The team receives alerts as they are created, so that it is aware of what is going on in the environment.
- Uses different monitoring tools such as LogicMonitor, ServiceNow, New Relic, and App Insights. All of these would be good to have.
- Will likely guide and mentor offshore resources; should be senior enough to work with them and provide guidance.
- Linux is used, but it is a minor part of the environment (Kafka connectors from on-prem to Kafka run on Linux). A plus to have, not required.
- Some on-call rotation will be required. The exact schedule is not yet known, but there is an offshore team for off hours. Likely some weekends.

Job Overview
As an SRE, you will be responsible for maintaining the reliability, availability, and performance of our systems and services. You will work closely with software engineering teams to build and run scalable, resilient systems while improving operational processes and automating tasks. You will actively participate in monitoring and in L1 and L2 resolutions as part of on-call rotations.

Key Responsibilities:
- Monitor system performance, troubleshoot issues, and resolve incidents to ensure optimal uptime.
- Develop and implement automation scripts and tools to streamline operations and improve efficiency.
- Analyze system capacity and performance metrics to plan for future growth and scalability.
- Create and maintain documentation for systems, processes, and incidents.
- Conduct postmortem reviews after incidents to identify root causes and implement corrective actions.
- Participate in on-call rotations to respond to outages and performance issues.

Required Qualifications:
- Bachelor's degree in Computer Science, Engineering, or a related field.
- At least 5 years of experience as a reliability engineer responsible for the stability of enterprise applications and databases.
- Strong experience with Windows and Linux/Unix systems administration.
- Proficiency in at least one programming/scripting language (e.g., Python, .NET, Java).
- Experience with cloud platforms (e.g., AWS, Azure).
- Familiarity with monitoring tools (e.g., Dynatrace, Prometheus, Grafana) and incident management tools (e.g., PagerDuty, ServiceNow).
- Solid understanding of networking concepts and protocols.
- Excellent problem-solving skills and attention to detail.

Preferred Qualifications:
- Experience in a DevOps or SRE role.
- Knowledge of CI/CD processes and tools.
- Understanding of database technologies (e.g., SQL Server, MySQL, PostgreSQL, NoSQL).
- Familiarity with security best practices.

NOTE: If I miss your call, the best way to reach me is by email or LinkedIn.

Bhavya Nikhil | Technical Recruiter | KPG99 Inc.
Direct (609-236-2769) | [email protected]
LinkedIn : https://www.linkedin.com/in/bhavya-nikhil-kujur-418511149/
http://www.kpg99.com/
3240 E State St EXT, Hamilton, NJ 08619
08:21 PM 11-Nov-24 |