Sr. System Development Engineer, High-Performance Accelerator Servers for AI/ML
Company: Amazon
Location: Seattle
Posted on: April 1, 2026
Job Description:
Do you want to shape the future of Generative AI at AWS? Join
the team building the foundation of the world’s most advanced cloud
for AI training and inference — where multi-billion-parameter
models come to life at scale. Here, you’ll design, deliver, and
operate next-generation infrastructure that powers breakthrough
innovation in AI/ML and HPC workloads. If you’re passionate about
pushing the limits of performance, efficiency, and scalability in
the cloud, this is your opportunity to build the systems that
define what’s next for AWS and for the entire AI industry.

You’ll join a diverse AWS Hardware Engineering team of software, hardware,
and network engineers, supply chain specialists, security experts,
operations managers, and other vital roles. You’ll collaborate with
people across AWS to help us deliver the highest standards for
safety and security while providing seemingly infinite capacity at
the lowest possible cost for our customers. And you’ll experience
an inclusive culture that welcomes bold ideas and empowers you to
own them to completion.

The ideal candidate for this role will be an innovative self-starter.
You are knowledgeable about the full technical stack, vertically from
bare-metal server hardware up to the software in userland, and
everything in between. You have tremendous interest in cloud scale
and are curious about how systems and software decisions impact the
user. You insist on the highest standards and are able to develop
tactical solutions and tools to diagnose and fix issues. You are an
excellent systems debugger, finding interaction issues between
components on server systems. You are a leader with strong
organizational, planning, and communication skills. You are a builder!

Key job responsibilities:
You will be a
technical leader solving complex problems. You will decompose large,
difficult server system testability, reliability, and diagnosis
problems into straightforward tasks, components, or features that
you will deliver yourself and, in parallel, through others.
You will use a combination of hardware, software, system design, x86
architecture, process, diagnosis, and operations knowledge.

A day in the life:
Working with a variety of job roles (SDEs, SDETs,
Hardware Engineers, TPMs, Managers, Principals) and groups (AWS
Hardware Engineering, EC2, other AWS services) through server
conception, design, test, launch, and operations. Driving high
quality and reliability into new and future designs for AWS
accelerated server solutions for the AWS Cloud.

About the team:
*Why AWS*
Amazon Web
Services (AWS) is the world’s most comprehensive and broadly
adopted cloud platform. We pioneered cloud computing and never
stopped innovating — that’s why customers from the most successful
startups to Global 500 companies trust our robust suite of products
and services to power their businesses.

*Diverse Experiences*
Amazon values diverse experiences. Even if you do not meet all of
the preferred qualifications and skills listed in the job
description, we encourage candidates to apply. If your career is
just starting, hasn’t followed a traditional path, or includes
alternative experiences, don’t let it stop you from applying.

*Work/Life Balance*
We value work-life harmony. Achieving success
at work should never come at the expense of sacrifices at home,
which is why we strive for flexibility as part of our working
culture. When we feel supported in the workplace and at home,
there’s nothing we can’t achieve in the cloud.

*Inclusive Team Culture*
Here at AWS, it’s in our nature to learn and be curious.
Our employee-led affinity groups foster a culture of inclusion that
empowers us to be proud of our differences. Ongoing events and
learning experiences, including our Conversations on Race and
Ethnicity (CORE) and AmazeCon (gender diversity) conferences,
inspire us to never stop embracing our uniqueness.

*Mentorship and Career Growth*
We’re continuously raising our performance bar as we
strive to become Earth’s Best Employer. That’s why you’ll find
endless knowledge-sharing, mentorship and other career-advancing
resources here to help you develop into a better-rounded
professional.

Qualifications:
- 4 years of non-internship professional software development
  experience
- 4 years of experience deploying and operating in a Linux/Unix
  environment
- 4 years of systems development experience in an IT or data center
  environment
- 3 years of programming experience with at least one modern language
  such as C++, C#, Java, Python, Golang, PowerShell, or Ruby
- 2 years of experience designing or architecting (design patterns,
  reliability, and scaling) new and existing systems
- 2 years of systems design, software development, operations,
  automation, and process improvement experience
- Experience leading the design, build, and deployment of complex and
  performant (reliable and scalable) software solutions in production
- 3 years of experience with a development, programming, or scripting
  language (Python/Java/Bash/Perl)
- Knowledge of engineering practices and patterns for the full
  software/hardware/network development life cycle, including coding
  standards, code reviews, source control management, build processes,
  testing, certification, and livesite operations
- Experience taking a leading role in building complex software or
  computing infrastructure that has been successfully delivered to
  customers
- Experience debugging, integrating, and validating complex AI/ML and
  cloud computing servers

Amazon is an
equal opportunity employer and does not discriminate on the basis
of protected veteran status, disability, or other legally protected
status.

Los Angeles County applicants: Job duties for this position
include: work safely and cooperatively with other employees,
supervisors, and staff; adhere to standards of excellence despite
stressful conditions; communicate effectively and respectfully with
employees, supervisors, and staff to ensure exceptional customer
service; and follow all federal, state, and local laws and Company
policies. Criminal history may have a direct, adverse, and negative
relationship with some of the material job duties of this position.
These include the duties and responsibilities listed above, as well
as the abilities to adhere to company policies, exercise sound
judgment, effectively manage stress and work safely and
respectfully with others, exhibit trustworthiness and
professionalism, and safeguard business operations and the
Company’s reputation. Pursuant to the Los Angeles County Fair
Chance Ordinance, we will consider for employment qualified
applicants with arrest and conviction records.

Our inclusive culture empowers Amazonians to deliver the best results
for our
customers. If you have a disability and need a workplace
accommodation or adjustment during the application and hiring
process, including support for the interview or onboarding process,
please visit
https://amazon.jobs/content/en/how-we-hire/accommodations for more
information. If the country/region you’re applying in isn’t listed,
please contact your Recruiting Partner.

Our compensation reflects the cost of labor across several US
geographic markets. The base
pay for this position ranges from $136,100/year in our lowest
geographic market up to $235,200/year in our highest geographic
market. Pay is based on a number of factors including market
location and may vary depending on job-related knowledge, skills,
and experience. Amazon is a total compensation company. Dependent
on the position offered, equity, sign-on payments, and other forms
of compensation may be provided as part of a total compensation
package, in addition to a full range of medical, financial, and/or
other benefits. For more information, please visit
https://www.aboutamazon.com/workplace/employee-benefits.

This position will remain posted until filled. Applicants should apply
via our internal or external career site.