Machine Learning Engineer (3 roles) | Relocation opportunity to Singapore | USD 100K - USD 130K per annum
Our client is a fast-growing, well-funded startup based in Singapore with a mission to make a significant impact on the world through AI. The founders of this venture have a proven track record in building talented teams and fostering innovation. They are looking for Chinese-speaking Machine Learning Engineers to join their global research team in Singapore.
The founders have experience leading a startup with a USD 1B (unicorn) valuation through to exit while simultaneously founding this new venture, and these three hires will report directly to them. The positions will be based in Singapore.
Core Responsibilities
Implement production-grade LLMs, including fine-tuned and RLHF-trained variants.
Enhance inference performance, reducing latency and cost with techniques such as model quantization, FlashAttention, and inference-serving engines like vLLM.
Develop robust pipelines for training and inference, utilizing frameworks such as DeepSpeed, Accelerate, and Ray.
Create tools for internal teams, enabling efficient multi-GPU training and large-scale experimentation.
Design and manage distributed infrastructures for training and deploying machine learning models, ensuring optimal use of computational resources.
Build cluster management tools to streamline operations across external cloud platforms, spanning multiple vendors.
Establish automated systems for ongoing model evaluation, addressing issues like model drift, performance, and cost efficiency.
Stay ahead of the curve by researching and applying the latest advancements in model compression and inference acceleration.
Minimum Qualifications
A minimum of 3 years' experience in ML engineering, focusing on model deployment and large-scale training.
Knowledge of vector databases (e.g., FAISS, Pinecone, Weaviate) for retrieval-augmented generation (RAG) tasks.
Hands-on expertise in multi-cloud ML deployment across platforms such as AWS, GCP, and Azure.
Proven experience deploying LLMs or comparable large-scale models in real-world settings.
Proficiency in multi-GPU training, model parallelism, and techniques to optimize inference.
A strong grasp of performance tuning for ML systems, including strategies for latency reduction and cost control.
Experience in constructing and maintaining large-scale machine learning clusters in cloud or hybrid environments.
Familiarity with LLM fine-tuning processes, reinforcement learning methodologies, and evaluation metrics.