Senior Deep Learning Architect, LLM Inference
Nvidia · JR2013930
We are now looking for a Senior Deep Learning Architect, LLM Inference! NVIDIA is at the forefront of the generative AI revolution. The Inference Benchmarking (IB) team focuses on inference-server performance optimization for Large Language Models (LLMs). If you're passionate about pushing the boundaries of GPU hardware and software performance, and terms like disaggregated serving, data-parallel attention, MoE, Qwen3.5, DeepSeek, and GPT-OSS are familiar to you, then this is a great role for you…