Data Science Research Intern
About the Internship
Elucidata is looking for a research intern to drive its exploratory initiatives around cutting-edge language models for direct translational applications in biomedicine. This is an opportunity to work at a category-defining start-up.
As a research intern at Elucidata, you’ll have the opportunity to work on groundbreaking projects involving protein structure and sequence data, leveraging genAI tools to deepen insights into protein properties.
Skills Required
- Research enthusiast looking to solve challenging problems in the domain of big data, genAI and protein biology
- Available for a 6-month, full-time internship
- Proficient with basic scripting, visualizations, and module development in Python
- Understand good coding habits: code management through Git and writing modular, fast, efficient, and clean code
- Experience with training and working with LLMs is an advantage
- Hands-on experience in applying computational algorithms and statistical methods to structured and unstructured big data
- Excellent communication and presentation skills
Who can apply
- Pursuing (with full-time availability) or have completed a Master's or B.Tech in Statistics, Bioinformatics, Computational Biology, Computer Science, or a related technical discipline
Responsibilities
- Build automated pipelines, libraries, and modules to scale up solutions
- Communicate the information and results using different visualization methods
- Work in a multi-disciplinary team with biologists, data scientists and data analysts
Perks
- Remote/Hybrid work environment
- The opportunity to be part of a dynamic team in a growing company
- Mentorship from experts in biology, bioinformatics, and LLMs, within and potentially from outside Elucidata
- A recommendation from experts at Elucidata may be provided based on the contribution to the project
- Meaningful contribution to the project shall result in authorship in any publication communicated by Elucidata
- Location: Bengaluru, Karnataka, India
About Company
Even today, early R&D, precision diagnostics, and translational biomarker teams spend about 80% of their time wrangling data. Elucidata’s mission is to empower scientists in the life sciences field by reclaiming every valuable hour for their research endeavors.
Elucidata’s data harmonization platform, Polly, helps research teams make multi-modal biomedical data machine-learning ready. Each dataset on Polly is processed consistently using pipelines of your choice, is custom curated with granular annotations, and undergoes robust QA/QC checks to ensure the highest data quality standards.
Polly transforms multi-modal and multi-source biomedical data (Omics, Assay, Real World Data, Clinical & EHR Data, and CRO data) into a Unified Data Model. With 10X faster LLM-powered curation and a human-in-the-loop model to achieve 99.99% accuracy, Elucidata is fast-tracking time to analysis.
Today, Polly is facilitating use cases such as patient stratification, biomarker discovery, target ID & validation, data management, and development of clinical and commercial pipelines across Pfizer, Janssen Pharmaceuticals, NextGen Jane, and IMBDx, as well as over 35 premier biopharma companies and research labs.