Machine Learning Data Scientist

Apple • Sunnyvale, California, United States • Full-time

$120,000 per year

Python, SQL, Machine Learning, Computer Vision, Data Visualization, Performance metrics, Data Scientist, Model Optimization, Failure analysis, model evaluation, Vision-Language Models, Multi-Modal LLM, MM-LLM, Experiment Design

Job Description

Do you have a passion for computer vision, large language models, and deep learning? The Video Engineering Data Analytics and Quality (DAQ) group is looking for an experienced Data Scientist with a strong background in computer vision, machine learning, and multi-modal LLM (MM-LLM) to join our dynamic team. The ideal candidate will be responsible for evaluating machine learning and MM-LLM models, developing performance metrics, and conducting thorough failure analysis. This role requires a deep understanding of ML algorithms, data processing, model optimization techniques, and modern evaluation approaches for vision-language models.

Description

Our organization supports a diverse array of programs passionate about evaluating ML algorithms and assessing model quality at scale, across domains like computer vision, audio, and multi-modal systems. You will collaborate with multi-functional teams, including domain experts and engineering leads, and adapt methodologies as new insights emerge.

In this role you will:

- Evaluate ML & MM-LLM Models: Analyze and validate computer vision, multi-modal, and large language models to ensure they meet accuracy, robustness, and usability standards.
- Develop Metrics: Design and implement metrics to measure the efficiency and accuracy of models.
- Failure Analysis: Conduct in-depth analysis on model failures across CV and MM-LLM pipelines to surface root causes and improvement areas.
- Data Processing: Clean, transform, and curate large-scale datasets for model evaluation and benchmarking.
- Model Optimization: Apply innovative techniques to optimize models for scalability and real-world deployment.
- Collaborate multi-functionally: Work closely with cross-functional teams, including software engineers, product managers, and other data scientists, to integrate models into production.
- Communicate Results: Present findings clearly and effectively to collaborators across levels of technical understanding.

Minimum Qualifications

- BS in a quantitative field and a minimum of 3 years of relevant industry experience.
- Proven background in data science, machine learning, computer vision, and statistical data analysis.
- Advanced programming skills in data manipulation & processing (SQL & Python preferred).
- Demonstrated experience in in-depth analysis of machine learning model failures.
- Experience crafting, conducting, analyzing, and interpreting experiments and investigations.
- Expertise in data wrangling and in developing data visualizations & reporting with tools such as Tableau, Superset, AWS, etc.

Preferred Qualifications

- Experience working with multi-modal foundation models such as GPT-4o, Gemini 2.5, Claude 3/4, LLaVA, Flamingo, etc.
- Familiarity with machine learning interpretability methods and standard processes.
- Exposure to evaluating vision-language models in production or research settings.
- Experience handling complex programs and collaborating across engineering, product, and data teams.
- Detail-oriented, with the ability to keep track of and understand the workings of sophisticated algorithms.
- Strong attention to detail in working with large datasets and complex ML systems.
- Curious, self-motivated, and able to drive improvements to model evaluation pipelines and annotation programs.
- Outstanding communication skills, both written and verbal, with experience presenting to leadership.