Into the Dataverse: A dataset of Superfund geolocations and their implications for health
The UNC SRP Dataverse portal serves as a primary platform for publishing SRP-related data across projects and cores. Julia Rager, PhD, co-lead of the Data Management and Analysis Core (DMAC), was instrumental in getting this portal set up.
“The DMAC was the primary driver for identifying how data should be permanently deposited across SRP projects and cores,” Rager explains. “Our team of data scientists evaluated the suitability of many different data repositories and made the decision to focus our efforts on UNC Dataverse due to its flexibility in allowing for the variety of data produced across environmental and biomedical projects.”
According to Rager, scientists within the UNC SRP and their collaborators can organize their own datasets according to examples provided by the DMAC and UNC Dataverse. Once the researchers sharing the data feel confident in how it will be permanently stored and accessed by the public, the DMAC assists researchers with officially publishing their novel datasets with a permanent DOI and full citation.
Eric Brown Jr., a doctoral candidate in UNC SRP Director Rebecca Fry’s lab, led efforts to publish a dataset in the UNC SRP Dataverse this past April. Consisting of 1,877 Superfund National Priorities List sites as designated by the US Environmental Protection Agency, the dataset includes geolocations of deleted, inactive, and active Superfund sites as well as the contaminants present, the contaminant media, a description of the site location, a site hazard assessment, and the years the site has been listed.
The project grew out of Brown and Dr. Fry’s shared interest in exploring environmental justice concerns related to hazardous waste sites, particularly given the environmental justice movement’s origin in North Carolina. The team then decided to curate a dataset from the EPA list of Superfund sites and related chemical information.
Current research using the dataset focuses on defining the population demographics around Superfund sites in NC and the Southeast. Brown’s future research plans include using the dataset to explore the prevalence of health outcomes such as cancer, asthma, and other disorders in proximity to Superfund sites.
Rager and the DMAC team have high aspirations for how the Dataverse and datasets like Brown’s will be leveraged for future research.
“We aim to streamline the clear and efficient sharing of important environmental and biomedical data, ultimately leading to improved dissemination of environmental health findings,” states Rager.