AI Inference Engineer - Speech
Company: Zoom
Location: Seattle
Posted on: April 1, 2026
Job Description:
What you can expect
We are looking for an AI Inference Engineer with a solid background
in speech recognition and model inference. In this role, you will
develop state-of-the-art automatic speech recognition systems and
ship them to various Zoom products. You will work on cutting-edge
speech modeling and inference technologies with world-class speech
scientists. This role includes collaboration with cross-functional
teams, including product, science, engineering, and infrastructure
teams, to deliver high-impact projects from the ground up.

About the Team
Zoom's AI Speech Team is developing speech recognition technologies
to improve Zoom's conversational AI experience. This work impacts
various products, such as Zoom AI Companion, Zoom Meetings and
Workplace, Zoom Contact Center, Zoom Phone, and Zoom Revenue
Accelerator. Our team's mission is to equip the powerful AI
brain with human-level listening and understanding for voice
input. As an AI Inference Engineer, you will develop novel
speech model inference solutions on modern AI inference hardware,
such as GPUs, TPUs, and AI-specific chips. Our goal is to deliver the
most unique AI-powered collaboration platform to users across the
globe.

Responsibilities
- Developing state-of-the-art speech services for Zoom products.
- Devising novel techniques where off-the-shelf solutions are not
  available.
- Optimizing ASR inference systems for production deployment,
  including inference latency, throughput, memory footprint, and
  resource utilization.
- Optimizing model inference performance by diving deep into the
  lower stack of inference frameworks, with a focus on
  hardware-specific optimizations for NVIDIA GPUs.
- Proposing new model structures by joint optimization of model
  accuracy and inference speed.
- Designing and developing ASR systems with low latency and high
  accuracy requirements, while ensuring scalability of GPU
  infrastructure and improving throughput of the ASR service.
- Profiling and debugging ASR runtime performance bottlenecks across
  different deployment hardware and environments.

What we’re looking for
- Possess a Master's degree in Computer Science, Electrical
  Engineering, or a related field with 3 years of experience in
  speech recognition, speech-LLMs, or AI model inference.
- Display knowledge of deep learning and hands-on programming skills
  in Python, shell scripts, and C/C++; familiarity with ML
  frameworks such as PyTorch and TensorFlow.
- Demonstrate deep understanding of transformer encoder-decoder
  frameworks for speech recognition, including attention mechanisms,
  beam search, and sequence-to-sequence modeling for end-to-end ASR
  systems.
- Understand recent advancements in speech foundation models and
  speech-LLMs that integrate acoustic and linguistic
  representations, enabling unified modeling for speech
  understanding and transcription tasks.
- Have experience in optimizing deep learning model inference on
  NVIDIA GPUs, including profiling and accelerating AI models using
  CUDA, TensorRT, and mixed-precision computation to achieve
  low-latency, high-throughput performance.
- Have experience developing and tuning custom CUDA kernels,
  leveraging CUDA Graphs for efficient execution scheduling, and
  minimizing kernel launch overhead to maximize GPU utilization.
- Be proficient in end-to-end performance analysis, memory
  optimization, and deployment of large-scale ML models on GPU
  clusters.
- Be experienced with stream management, asynchronous execution, and
  integrating frameworks such as PyTorch and TensorFlow for
  real-time inference.

Salary Range or On Target Earnings:
Minimum: $151,800.00
Maximum: $332,200.00

In addition to the base salary and/or OTE listed, Zoom has a Total
Direct Compensation philosophy that takes into consideration base
salary, bonus, and equity value.

Note: Starting pay will be based on a number of factors and
commensurate with qualifications & experience. We also have a
location-based compensation structure; there may be a different
range for candidates in this and other locations.

At Zoom, we offer a window of at least 5 days for you to apply
because we believe in giving you every opportunity. Below is the
potential closing date, just in case you want to mark it on your
calendar. We look forward to receiving your application!

Anticipated Position Close Date: 04/01/26

Ways of Working
Our structured hybrid approach is centered
around our offices and remote work environments. The work style of
each role (Hybrid, Remote, or In-Person) is indicated in the job
description/posting.

Benefits
As part of our award-winning
workplace culture and commitment to delivering happiness, our
benefits program offers a variety of perks, benefits, and options
to help employees maintain their physical, mental, emotional, and
financial health; support work-life balance; and contribute to
their community in meaningful ways.

About Us
Zoomies help people stay connected so they
can get more done together. We set out to build the best
collaboration platform for the enterprise, and today help people
communicate better with products like Zoom Contact Center, Zoom
Phone, Zoom Events, Zoom Apps, Zoom Rooms, and Zoom Webinars. We’re
problem-solvers, working at a fast pace to design solutions with
our customers and users in mind. Find room to grow with
opportunities to stretch your skills and advance your career in a
collaborative, growth-focused environment.

Our Commitment
At Zoom,
we believe great work happens when people feel supported and
empowered. We’re committed to fair hiring practices that ensure
every candidate is evaluated based on skills, experience, and
potential. If you require an accommodation during the hiring
process, let us know—we’re here to support you at every step. If
you need assistance navigating the interview process due to a
medical disability, please submit an Accommodations Request Form
and someone from our team will reach out soon. This form is solely
for applicants who require an accommodation due to a qualifying
medical disability. Non-accommodation-related requests, such as
application follow-ups or technical issues, will not be
addressed.
Keywords: Zoom, Auburn, AI Inference Engineer - Speech, IT / Software / Systems, Seattle, Washington