Vice President of AI Infrastructure & Engineering

Nanjing Taorui Business Management Consulting Co., Ltd.

San Francisco County, California

Job Description

Reporting to: CEO

Position Overview

*Build and Scale the Training Platform:*

Design and maintain a high-performance training architecture that supports GPU clusters on the scale of tens of thousands of cards.

*Key Focus:* Establish a unified task scheduling and model management system (MLOps). All model training, checkpoint storage, and code version control must be conducted exclusively through this platform.

*Engineering-Led Data Governance:*

Develop end-to-end data processing pipelines covering cleansing, annotation, versioning, and secure storage.

*Key Focus:* Implement rigorous data access audit logs and data lineage tracking to ensure security and compliance of core data assets.

*Developer Experience as Control Mechanism:*

Maximize experimental efficiency through standardized toolchains, enabling researchers to focus solely on algorithm development without managing underlying configurations.

*Key Focus:* Codify "best practices" into reusable code templates.

*System Reliability and Disaster Recovery:*

Implement fault tolerance, automated model snapshot archiving, and continuity protocols to prevent loss of critical assets due to single points of failure (hardware or human).

Qualifications

*Background:* 10+ years in distributed systems, cloud computing, or high-performance computing (HPC), with prior experience in core infrastructure teams at leading firms such as Google, Meta, AWS, or NVIDIA.

*Mindset:* Exceptional engineering rigor with a focus on building stable, scalable systems rather than solely pursuing algorithmic innovation. Service-oriented attitude with a commitment to empowering top-tier scientists.

*Technical Skills:* Proficiency in orchestration systems such as Kubernetes, Ray, or Slurm; familiarity with PyTorch distributed training frameworks; deep understanding of data security and access control mechanisms.
