Modeling Contextual Data to Reduce Outcome Variation: Analysis of the StudentLife Dataset
This project systematically explores how "context" can be conceptualized and used to model outcomes in clinical research. It analyzes the StudentLife dataset to examine how capturing contextual data can reduce variation in key outcome measures, such as the PHQ-9 (a measure of depression). The goal is to model how contextual covariates can explain variation over time, particularly during periods of increasing stress, such as a college semester. The findings will contribute to a viewpoint paper and a detailed analysis of how contextual data could reduce trial sizes in research.
Context (i.e., the surroundings and setting in which clinical evidence is measured) strongly influences how outcomes are interpreted. In longitudinal clinical studies, variability in primary outcomes often complicates data interpretation and increases the required sample size. By incorporating covariates that summarize contextual data, we believe it is possible to account for factors that drive variation in outcomes, potentially reducing the required sample size and improving model accuracy. The StudentLife dataset, which includes a range of contextual data collected over a college semester, offers a unique opportunity to investigate this potential.
The ideal candidate would have, or be interested in learning, the following skills:
- Data Analysis: Proficiency in analyzing large datasets, particularly using R or Python (the StudentLife dataset is available in RData format).
- Biostatistics/Modeling: Experience with modeling longitudinal data and understanding the role of covariates.
- Research and Writing: Ability to develop and articulate a strong viewpoint and conduct thorough literature reviews.
- Ontology Design: Familiarity with ontology or system development for data integration.
- Data and Sensor Technologies: Knowledge of sensors, data sources, and their application in clinical outcome modeling.
The project will be divided into the following tasks:
**_Dataset Analysis:_**
- StudentLife Dataset: Analyze the StudentLife dataset (https://zenodo.org/records/3529253), which includes self-report questionnaires (such as PHQ-9), activity data, audio, Bluetooth encounters, conversations, light exposure, GPS coordinates, phone usage (screen on/off, charge status), and Wi-Fi IDs from 48 participants over 66 days.
- Contextual Covariates: Identify and model contextual covariates that could explain variation in PHQ-9 scores over time, taking into account the natural trend in PHQ-9 scores as student stress increases throughout the semester.
**_Viewpoint Development:_**
- Trial Size Reduction Model: Develop a model to demonstrate how much trial sizes could be reduced if 1%, 5%, 10%, etc., of the variation in the primary outcome (e.g., PHQ-9) could be explained by incorporating contextual data.
- Viewpoint Paper: Prepare a viewpoint paper discussing the potential benefits of capturing and modeling contextual data in reducing outcome variability and improving study designs.
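One way to make the trial size reduction concrete is the standard covariate-adjustment approximation: if covariates explain a fraction R² of the outcome variance, the residual variance is (1 - R²) times the original, so the sample size needed to compare two arm means shrinks by roughly the same factor. The Python sketch below illustrates this under stated assumptions and is not the project's actual model. The effect size (2 PHQ-9 points), standard deviation (5 points), significance level, and power are hypothetical values, and the function names are invented for illustration.

```python
import math
from statistics import NormalDist


def n_per_arm(delta, sd, alpha=0.05, power=0.80):
    """Per-arm sample size for a two-sided, two-sample z-test on means."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)


def n_per_arm_adjusted(delta, sd, r_squared, alpha=0.05, power=0.80):
    """Same calculation after covariates explain a fraction r_squared of the variance."""
    if not 0 <= r_squared < 1:
        raise ValueError("r_squared must be in [0, 1)")
    # Residual SD after adjustment: sd * sqrt(1 - R^2).
    residual_sd = sd * math.sqrt(1 - r_squared)
    return n_per_arm(delta, residual_sd, alpha, power)


if __name__ == "__main__":
    delta, sd = 2.0, 5.0  # hypothetical effect size and SD, in PHQ-9 points
    baseline = n_per_arm(delta, sd)
    for r2 in (0.01, 0.05, 0.10, 0.25):
        adjusted = n_per_arm_adjusted(delta, sd, r2)
        saved = 100 * (1 - adjusted / baseline)
        print(f"R^2 = {r2:.2f}: {adjusted} per arm vs {baseline} ({saved:.0f}% fewer)")
```

The normal approximation slightly understates the sample size a t-test would require, and a longitudinal design with repeated PHQ-9 measurements would need a model that accounts for within-participant correlation.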
**_Modeling and System Integration:_**
- Improved Modeling: Use the identified contextual covariates to improve the modeling of anchor outcomes (e.g., PHQ-9) over time. Assess how well these covariates explain variations and trends in the data.
- Ontology/System Development: Develop an initial ontology or system to summarize and integrate contextual data with primary outcomes. This will be informed by a literature review of existing approaches.
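As a rough illustration of assessing how well a covariate explains variation, the Python sketch below computes the share of within-participant outcome variance explained by a single contextual covariate. It centers both variables on each participant's own mean and then takes the squared correlation of the centered values, which equals R² for a one-covariate least-squares fit. The record layout, the `sleep_hours` covariate, and the toy numbers are all assumptions made for illustration, not values from StudentLife. A full analysis would more likely use a mixed-effects model.

```python
from collections import defaultdict


def within_person_r2(records, outcome, covariate):
    """Share of within-participant outcome variance explained by one covariate.

    records: iterable of dicts with the keys "pid", outcome, and covariate.
    """
    by_pid = defaultdict(list)
    for row in records:
        by_pid[row["pid"]].append((row[covariate], row[outcome]))

    xs, ys = [], []
    for pairs in by_pid.values():
        mean_x = sum(x for x, _ in pairs) / len(pairs)
        mean_y = sum(y for _, y in pairs) / len(pairs)
        # Center on the participant's own mean to remove between-person differences.
        xs.extend(x - mean_x for x, _ in pairs)
        ys.extend(y - mean_y for _, y in pairs)

    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        return 0.0  # no within-person variation to explain
    return (sxy * sxy) / (sxx * syy)


if __name__ == "__main__":
    # Toy records: hypothetical participants, covariate, and scores.
    toy = [
        {"pid": "a", "sleep_hours": 8.0, "phq9": 4},
        {"pid": "a", "sleep_hours": 6.5, "phq9": 7},
        {"pid": "a", "sleep_hours": 6.0, "phq9": 6},
        {"pid": "b", "sleep_hours": 7.0, "phq9": 10},
        {"pid": "b", "sleep_hours": 5.0, "phq9": 13},
        {"pid": "b", "sleep_hours": 5.5, "phq9": 15},
    ]
    r2 = within_person_r2(toy, outcome="phq9", covariate="sleep_hours")
    print(f"Within-person R^2: {r2:.2f}")
```

Centering by participant is a deliberate choice here: it separates variation over time within a person from stable differences between people, which matches the longitudinal focus of the project.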
The project is expected to result in the following outputs:
**_Viewpoint Paper:_**
- A paper presenting the case for capturing contextual data to reduce variation in primary outcomes and the potential to reduce trial sizes.
**_Modeling Paper:_**
- A paper focused on improved modeling of anchor outcomes (e.g., PHQ-9) using longitudinal contextual covariates identified from the StudentLife dataset.
**_Ontology/System:_**
- An initial ontology or system designed to integrate contextual data into outcome modeling, informed by a literature review of existing approaches.