Workato transforms technology complexity into business opportunity. As the leader in enterprise orchestration, Workato helps businesses globally streamline operations by connecting data, processes, applications, and experiences. Its AI-powered platform enables teams to navigate complex workflows in real-time, driving efficiency and agility.

Trusted by a community of 400,000 global customers, Workato empowers organizations of every size to unlock new value and lead in today’s fast-changing world. Learn how Workato helps businesses of all sizes achieve more at workato.com.

Why join us?

Ultimately, Workato believes in fostering a flexible, trust-oriented culture that empowers everyone to take full ownership of their roles. We are driven by innovation and looking for team players who want to actively build our company.

But, we also believe in balancing productivity with self-care. That’s why we offer all of our employees a vibrant and dynamic work environment along with a multitude of benefits they can enjoy inside and outside of their work lives.

If this sounds right up your alley, please submit an application. We look forward to getting to know you!

Also, feel free to check out why:
- Business Insider named us an “enterprise startup to bet your career on”
- Forbes’ Cloud 100 recognized us as one of the top 100 private cloud companies in the world
- Deloitte Tech Fast 500 ranked us as the 17th fastest growing tech company in the Bay Area, and 96th in North America
- Quartz ranked us the #1 best company for remote workers

Responsibilities

We are seeking an experienced Data Science / Machine Learning Engineering Lead to join our team and drive the development of advanced ML/AI capabilities.
You will lead a team of Data Scientists / ML Engineers, focusing on building and deploying cutting-edge machine learning solutions using our modern ML infrastructure, including Anthropic, OpenAI, and self-hosted LLMs.

Team Leadership & Management
- Lead, mentor, and develop a team of Data Scientists, Data Engineers, and ML Engineers
- Conduct regular 1:1s, performance reviews, and career development planning
- Foster a collaborative, innovative team culture focused on continuous learning
- Coordinate work allocation and ensure timely delivery of projects
- Facilitate knowledge sharing and best practices across the team

Technical Leadership
- Design and implement scalable ML model training pipelines using a modern toolset (e.g., MLflow, Comet, Langfuse, WandB, Trino, dbt, Spark, Flink)
- Lead fine-tuning initiatives for both commercial (Anthropic Claude, OpenAI GPT) and open-source LLMs
- Utilise self-hosted LLM infrastructure using Ray, AIBrix, and vLLM for optimal performance and cost efficiency with LoRA/QLoRA
- Architect and oversee continuous model validation frameworks within our ecosystem
- Develop real-time anomaly detection systems leveraging streaming data processing
- Build predictive models for system performance, usage patterns, and automation workflow optimization
- Establish ML engineering best practices for model versioning, monitoring, and deployment on Kubernetes
- Create evaluation, validation, and metrics pipelines for models during training and inference

Strategic Initiatives
- Optimize the balance between commercial APIs (Anthropic, OpenAI) and self-hosted models for different use cases
- Partner with product and engineering teams to identify high-impact ML opportunities
- Define the team's technical roadmap aligned with company objectives
- Drive adoption of state-of-the-art ML techniques and tools
- Contribute to infrastructure decisions for scaling our ML platform

Operational Excellence
- Implement robust CI/CD pipelines for ML models in Kubernetes environments
- Monitor model performance using MLflow tracking and implement drift detection
- Manage Flink jobs for real-time feature engineering and anomaly detection
- Document processes, architectures, and decision rationale

Requirements

Qualifications / Experience / Technical Skills

Education & Experience
- Master's or PhD in Computer Science, Machine Learning, Statistics, or related field
- 10+ years of hands-on experience in data science/machine learning
- 5+ years of experience leading technical teams
- Proven track record of deploying ML & LLM models to production at scale

Technical Skills
- Deep expertise in Python and ML frameworks (PyTorch, TensorFlow)
- Extensive experience with commercial LLM APIs (Anthropic Claude, OpenAI GPT-4)
- Strong proficiency with MLflow for experiment tracking and model management
- Experience with distributed computing using Apache Spark
- Proficiency with Apache Flink for stream processing and real-time ML
- Knowledge of LLM fine-tuning techniques (LoRA, QLoRA, full fine-tuning)
- Expertise in anomaly detection algorithms and time series analysis

Leadership Skills
- Demonstrated ability to lead and inspire technical teams
- Strong communication skills to translate complex technical concepts to stakeholders
- Experience with agile development methodologies
- Track record of successful cross-functional collaboration
- Ability to balance technical excellence with business pragmatism

Soft Skills / Personal Characteristics
- Experience with AIBrix, vLLM, or similar ML platform solutions
- Experience with AI code generation and anonymisation pipelines
- Knowledge of advanced prompting techniques and prompt engineering
- Experience building RAG (Retrieval Augmented Generation) systems
- Background in building ML platforms or infrastructure
- Familiarity with vector databases (Pinecone, Weaviate, Qdrant)
- Experience with model security and responsible AI practices
- Contributions to open-source ML projects