ODIn @ HILDA '24

'Grats to Pratik and Juseung on their #HILDA2024 accept for "Drag, Drop, Merge: A Tool for Streamlining Integration of Longitudinal Survey Instruments", which explores schema integration in longitudinal studies. Longitudinal surveys, and specifically social science data collected through survey forms, are a really interesting case of schema integration.

The data being collected is, at the most fundamental level, about only a single class of entity. However, each year brings new knowledge and new context to the survey, and the survey has to change in response. For example, researchers might learn that the study population uses different names in different social contexts, so the survey must be revised to clarify which social context the recorded name belongs to. Alternatively, researchers might change a phrasing like "how many of your family members live nearby" to "how many people are in your support network" to better capture the nuances of participants' situations. Even without changes to the survey itself, a changing context can change how participant answers should be interpreted.
Take, for instance, a multiple-choice question about income levels.
The same answer may indicate a wildly different socioeconomic status at the start of a 20-year study than it does in the study's last year.

The problem of integrating many years of forms is fundamentally similar to data integration, but it is in some ways easier (there are few changes between successive years) and in some ways harder (there are many such changes over the lifetime of the survey). The changes are also nuanced, and the forms diverge further from one another as the changes accumulate.

The paper lays the groundwork for a tool that helps researchers conducting longitudinal studies prepare their data for publication, and helps researchers using that study data reliably develop derived, 'clean' datasets suited to the needs of their specific study.

Side Note: This paper is the result of a massively interdisciplinary collaboration between CS, Linguistics, Medicine, and Stats (with Environmental Health soon to join). I'm really excited that we've hit on an opportunity to develop techniques that will benefit such a diverse range of fields.


This page last updated 2024-08-27 00:23:18 -0400