Senior Machine Learning Engineer (Omics and Graph Intelligence)
BenchSci
You Will:
- Build and deploy machine learning models over omics data (e.g. genomics, transcriptomics, proteomics, epigenomics, and multi-omics), capturing biological structure, variability, and experimental context.
- Work with major omics and biomedical databases (e.g. gene, protein, pathway, interaction, and expression resources) to integrate heterogeneous biological signals into unified learning pipelines.
- Develop and apply foundation models for biological data, including sequence-based, expression-based, and multi-modal models, adapting them to downstream scientific and product use cases.
- Design ML systems that populate, enrich, and reason over a biological knowledge graph, connecting entities such as genes, proteins, pathways, phenotypes, diseases, and experimental evidence.
- Apply graph-based methods tailored to biology, including graph embeddings, message passing, and network-aware learning, to model molecular interactions and biological systems.
- Collaborate with BenchSci’s Science team to ensure models reflect biological constraints, experimental design, and domain nuance, not just statistical patterns.
- Power downstream experiences by surfacing insights through semantic search, recommendation, and conversational AI / chat-based scientific assistants.
- Improve scalability, robustness, and interpretability of models operating on large, sparse, noisy, and biased omics datasets.
- Lead technical decision-making within the ML team, mentor other engineers, and help define best practices for applied ML in biomedical settings.
- Own projects end-to-end, from data exploration and model prototyping to production deployment and monitoring.
- Continuously improve the performance and scalability of ML models that are at the core of BenchSci’s products.
- Regularly investigate which technologies will best enable BenchSci to effectively generate use cases.
- Advocate for code and process improvements across your team, and help to define best practices based on personal industry experience and research.
- Participate in sprint planning, estimation, and reviews. Take ownership of deliverables, and work with teammates to ensure they are high quality.
You Have:
- Bachelor’s degree or higher in Computer Science, Mathematics, Machine Learning, Bioinformatics, or a related field.
- Leadership: 2+ years of tech lead experience in a production ML environment.
- Hands-on experience working with omics data and omics-derived resources, such as genomic sequences, expression matrices, protein data, or biological networks.
- Familiarity with omics and biomedical databases (e.g. gene/protein annotations, interaction networks, pathway databases, expression atlases).
- Experience with or strong interest in biological foundation models, such as sequence models, embedding models, or multi-modal models applied to molecular or cellular data.
- Solid understanding of graph methods in a biological context, including knowledge graphs, molecular interaction networks, or pathway-level representations.
- Experience applying NLP or LLM-based techniques to scientific text or integrating text-based evidence with structured biological data.
- Strong experience with TensorFlow, PyTorch, and omics processing libraries.
- Comfort working across disciplines, collaborating closely with scientists, engineers, and product teams.
- A team player who strives to see teammates succeed together.
- A growth mindset, strong ownership mentality, and desire to work on scientifically meaningful problems.
- A constant desire to grow and develop.
Nice to Have:
- Research publications in ML, AI, or bioinformatics.
- Experience with multi-omics integration or cross-modal biological learning.
- Prior work on biomedical knowledge graphs, graph neural networks, or hybrid symbolic-neural systems.
- Experience deploying ML models into production systems used by scientists.
- Experience building AI-powered scientific assistants or chat-based analytical tools.
