Job Information
The University of Chicago Data Scientist - JR28692-3800 in Chicago, Illinois
This job was posted by https://illinoisjoblink.illinois.gov : For more information, please see: https://illinoisjoblink.illinois.gov/jobs/12406565
Department
PSD Computer Science: Administration and Staff
About the Department
Are you passionate about leveraging your skills to combat global health threats? The University of Chicago (UC) invites applications from an experienced data scientist to help develop predictive models that aid vaccine development for the Coalition for Epidemic Preparedness Innovations (CEPI) Disease X program, aimed at preventing the spread of emerging viruses.
Led by Professor Rick Stevens from the Department of Computer Science, our project focuses on constructing a comprehensive knowledge base of data and AI models crucial for vaccine development against emerging viruses. Researchers with diverse backgrounds from UChicago and Argonne National Laboratory will be partnering with cross-institution collaborators at Houston Methodist, UT Austin, UTMB, JCVI, and LJI.
Job Summary
The University of Chicago is seeking a highly motivated data scientist to join a groundbreaking initiative aimed at preparing for future pandemics. Though the recent pandemic has eased, the next one can strike at any time. In collaboration with global partners and the Coalition for Epidemic Preparedness Innovations' (CEPI) Disease X program, our mission is to develop tools and technologies capable of delivering a vaccine within 100 days of identifying any emerging virus.
As a key member of our team, you will design data pipelines, develop artificial intelligence (AI)-based models, and curate scientific datasets to combat public health threats worldwide. If you are passionate about using AI and computational science to solve real-world challenges, we want to hear from you!
Responsibilities
- Collect, curate, and analyze biological datasets, including sequence/structure data and scientific literature.
- Build and deploy ML/AI models, particularly large language models (LLMs), to analyze multi-modal data from biological and textual sources.
- Work closely with computational biologists, software engineers, and epidemiologists to drive innovative solutions.
- Present findings to internal stakeholders, global collaborators, and scientific audiences.
- Contribute to high-impact publications, conferences, and scientific journals.
- Has a deep understanding of methods to analyze complex data sets for the purpose of extracting and purposefully using applicable information. May develop and maintain infrastructure that connects data sets.
- Guides staff or faculty members in defining the project and applies principles of data science in manipulation, statistical applications, programming, analysis and modeling.
- Calibrates data between large and complex research and administrative datasets. Guides and may set the operational protocols for collecting and analyzing information from the University's various internal data systems as well as from external sources.
- Designs and evaluates statistical models and reproducible data processing pipelines using expertise in best practices for machine learning and statistical inference. Provides expertise for high-level or complex data-related requests and engages other IT resources as needed. Partners with other campus teams to assist faculty with data science-related needs.
- Performs other related work as needed.
Minimum Qualifications
Education:
Minimum requirements include a college or university degree in related field.
Work Experience:
Minimum requirements include knowledge and skills developed through 5-7 years of work experience in a related job discipline.
Certifications:
---
Preferred Qualifications
Education:
- Master's or higher degree in computer science, engineering, or a related field.
Experience:
- Experience with genomic data, protein structures, or other biological datasets.
- Experience in developing or applying ML models (experience with LLMs is a plus).
Technical Skills or Knowledge:
- Familiarity with data modeling and generation for AI training.
- Knowledge of vector databases is a plus.
Preferred Competencies
- Ability to design reliable ingest mechanisms.
- Ability to collaborate with teams across different scientific domains.
- Strong written and verbal skills for scientific presentations and publications.
- Ability to work both independently and collaboratively in a team environment.
- Background understanding or coursework in the biological sciences and related methodologies is a plus.
Application Documents
- Resume/CV (required)
- Cover Letter (required)
- References Contact Information (3) (required)
When applying, the document(s) MUST