This job has expired

Data Curation Engineer

Employer: Calico Life Sciences
Location: South San Francisco, CA
Closing date: Dec 6, 2022
Data Curation Engineer - Development Program

Who we are:
Calico is a research and development company whose mission is to harness advanced technologies to increase our understanding of the biology that controls lifespan, and to devise interventions that enable people to lead longer and healthier lives. Executing on this mission will require an unprecedented level of interdisciplinary effort and a long-term focus for which funding is already in place.

Position Description
As a Data Curation Engineer, you will work closely with Calico and partner scientists to help store and provide access to large, complex, and diverse biological datasets. You will develop schemas to accurately capture and document experimental results and methods at an appropriate technical level. You will advise and assist scientists and data scientists in best practices for biological metadata management. Specifically, you will import a large dataset from a research collaborator that includes genomics and proteomics data. You will be primarily responsible for working with the data creators to transmit the data to Calico, along with all of the metadata required both to search and to interpret that data. You must be able to learn and work independently, yet collaborate well with coworkers and share their passion to advance Calico's quest to understand aging and age-related disease.

Responsibilities

* Work with scientists to identify optimal ways to prepare, annotate, store and navigate their datasets
* Work with software & information technology teams to specify, design, and implement the infrastructure for storing, searching, visualizing and integrating experimental datasets
* Define and document best practices for capturing and entering experimental metadata, and educate scientists and collaborators about these standards
* Assist labs in data and metadata submission
* Write scripts to submit and verify data and metadata
* Track the flow of data within ETL and analysis pipelines, ensuring successful processing and data validity

Requirements

* 3+ years of experience curating (organizing, cleaning and efficiently manipulating) scientific datasets
* Advanced knowledge of biology (degree in life sciences or computational biology, and/or experience working in a biology lab environment)
* Experience with content management systems, e.g., SharePoint, OneNote
* Experience curating data within a LIMS, e.g., Benchling, FreezerPro
* Detail-oriented with strong organizational, project management and analytical skills
* Ability to work effectively with scientists to elucidate and translate data organization needs into written requirements and specifications
* Ability to understand scientific literature, experimental procedures and their limitations, and current needs of the research community
* Ability to provide specification and review as part of software development
* Familiarity with relational databases and relational data concepts
* Fluency with Unix tools for data manipulation
* Ability to clearly and concisely communicate technical, scientific and non-technical information, both verbally and in writing

Nice to Have

* Advanced knowledge of bioinformatics, genomics, and proteomics methods, data structures and formats
* Experience programming with Python, including basic data loading and analysis
* Familiarity with controlled vocabularies and ontologies
* Familiarity with agile software development processes in a collaborative setting, e.g., reading and reviewing teammates' code in GitHub or a similar source control system
* Experience with data quality assessment
