Staff Site Reliability Engineer
Henderson, NV 
Posted 1 day ago
Job Description

Credit Acceptance is proud to be an award-winning company with local and national workplace recognition in multiple categories! Our world-class culture is shaped by dedicated Team Members who share a drive to succeed as professionals and together as a company. A great product, amazing people and our stable financial history have made us one of the largest used car finance companies nationally.

Our Engineering and Analytics Team Members utilize the latest technology to develop, monitor, and maintain complex practices that help optimize our success. Our Team Members value being challenged, are encouraged to express their ideas, and have the flexibility to enjoy work-life balance. We build intrinsic value by partnering with all functions of our business to support their success and make strategic business decisions. We focus on professional development and continuous improvement while enjoying a casual work environment and Great Place to Work culture!

At Credit Acceptance, we believe in the power to change lives! We're working to be the best world-class, customer-centric fintech company on earth. To get there, we need especially qualified, self-motivated, and hard-working people with a knack for solving technology problems.

We are seeking a talented and experienced Staff Site Reliability Engineer to join our dynamic and innovative team. As a Staff Site Reliability Engineer, you will play a crucial role in ensuring the reliability, availability, and performance of our software systems. You will collaborate with cross-functional teams to design, implement, and maintain robust systems, monitoring tools, and processes. The ideal candidate will have a strong background in software development, system architecture, and a passion for creating reliable and scalable software solutions.

Join our team and contribute to building and maintaining highly reliable software systems that power our cutting-edge solutions! Apply now by submitting your resume and cover letter.

Outcomes and Activities:

  • System Architecture and Design:

      • Collaborate with software engineers, architects, and operations teams to design highly reliable and scalable systems.
      • Evaluate existing systems and propose improvements to enhance reliability, performance, and availability.

  • Implementation and Coding:

      • Develop and implement code to automate operational processes and tasks to improve system reliability and efficiency.
      • Create and maintain scripts and tools for monitoring, logging, and alerting.

  • Monitoring and Incident Response:

      • Implement and manage monitoring solutions to proactively identify and address reliability issues.
      • Participate in on-call rotations and respond to incidents promptly to minimize downtime.

  • Performance Analysis and Optimization:

      • Conduct performance analysis to identify bottlenecks and optimize system performance.
      • Work with development teams to address performance-related issues in the codebase.

  • Capacity Planning:

      • Collaborate with capacity planning teams to ensure systems can handle anticipated growth and demand.
      • Proactively identify capacity-related challenges and propose solutions.

  • Documentation:

      • Maintain comprehensive documentation for system configurations, processes, and procedures.
      • Contribute to knowledge sharing within the team and across departments.

Competencies: The following items detail how you will be successful in this role.

  • This position will work from home; occasional planned travel to an assigned Southfield, Michigan office location may be required. The team member may instead work at a Southfield, Michigan office location upon request.
  • Development: Develops solutions using the standards and best practices of the application's language. Writes code that implements the design and is testable, extensible, efficient, and maintainable.
  • Impact Analysis: Understands the rationale behind changes and how they impact the enterprise and/or applications across the technical ecosystem.
  • Solution Design: Ability to translate high-level requirements into designs that meet the needs of the customer and are technically sound, maintainable, and cost-effective. Ability to identify missing or ambiguous requirements. Ability to design at both high and low levels of abstraction, understand complex requirements, and translate them into understandable solutions. Ability to accurately estimate based on requirements.
  • Technical Domain: Understands the technical domain, including the architecture, design, and data of the application they support and the systems with which it interfaces.
  • Facilitation Techniques: Organizes, supports, and/or conducts workshops, meetings, and presentations tailored to the objectives of each, the problem to be solved, and the needs of the audience.

Requirements:

  • Bachelor's or Master's degree in Computer Science, Information Technology, or a related field.
  • Minimum 10 years of experience as a Site Reliability Engineer or in a similar role.
  • Strong programming and scripting skills (e.g., Python, Shell, Java).
  • In-depth knowledge of system architecture, distributed systems, and networking.
  • Experience with cloud platforms (e.g., AWS, Azure, GCP) and containerization technologies (e.g., Docker, Kubernetes).
  • Proficiency in using monitoring and logging tools (e.g., Prometheus, Grafana, ELK stack).
  • Familiarity with continuous integration and continuous deployment (CI/CD) practices.
  • Certification in relevant areas (e.g., AWS Certified DevOps Engineer, Certified Kubernetes Administrator) is a plus.

Preferred Knowledge and Skills:

  • Excellent troubleshooting and problem-solving skills
  • Strong communication and collaboration skills
  • Lead the design and implementation of our cloud migration strategy, leveraging your Linux expertise to ensure a seamless transition.
  • Champion the adoption of SRE principles across the organization, leading initiatives to educate and mentor teams on reliability best practices while promoting a culture of reliability and continuous improvement.
  • Spearhead incident response initiatives, employing robust problem-solving skills to minimize downtime and maintain system availability.
  • Promote a culture of blameless postmortems and learning from failures, fostering a continuous improvement mindset within the organization.
  • Mentor junior engineers, sharing your Linux, cloud migration, observability, and incident response knowledge to foster a culture of technical excellence.

Targeted Total Compensation: $187,000 - $313,750. Total compensation consists of a competitive base salary and an annual variable compensation package.

Benefits

  • Excellent benefits package that includes 401(k) match, adoption assistance, parental leave, tuition reimbursement, comprehensive medical/dental/vision, and many nonstandard benefits that make us a Great Place to Work.

Our Company Values:

To be successful in this role, Team Members need to be:

  • Positive by maintaining resiliency and focusing on solutions
  • Respectful by collaborating and actively listening
  • Insightful by cultivating innovation, accumulating business and role specific knowledge, demonstrating self-awareness and making quality decisions
  • Direct by effectively communicating and conveying courage
  • Earnest by taking accountability, applying feedback and effectively planning and priority setting

Expectations:

  • Remain compliant with our policies, processes, and legal guidelines
  • All other duties as assigned
  • Attendance as required by department

Advice!

We understand that your career search may look different than others. Our hiring team wants to make sure that this would be a fit not just for us, but for you long term. If you are actively looking or starting to explore new opportunities, send us your application!

P.S.

We have great details around our stats, success, history and more. We're proud of our culture and are happy to share why - let's talk!

Required degrees must have been earned at institutions of Higher Education which are accredited by the Council for Higher Education Accreditation or equivalent.

Credit Acceptance is dedicated to providing a safe and inclusive working environment for all. As part of our Culture of Compliance, we are proud to be an Equal Opportunity Employer and value our culturally diverse workforce. All qualified applicants will receive consideration for employment regardless of the person's age, race, color, religion, sex, gender, sexual orientation, gender identity, national origin, veteran or disability status, criminal history, or any other legally protected characteristic.

California Residents: Please see the California Consumer Privacy Act (CCPA) notice regarding the personal information Credit Acceptance may collect from you.


Credit Acceptance is dedicated to providing an inclusive environment for all. We are proud to be an Equal Opportunity Employer and value a culturally diverse workforce. We believe in ensuring all team members demonstrate mutual respect for one another. All qualified applicants will receive consideration for employment without regard to protected characteristics like age, race, color, religion, sex, sexual orientation, gender identity, national origin, veteran or disability status.

 

Job Summary

Start Date: As soon as possible
Employment Term and Type: Regular, Full Time
Required Education: Bachelor's Degree
Required Experience: Open