Which datasets should MAST bring in to maximize Roman science return?
The Mikulski Archive for Space Telescopes (MAST) is planning to host additional public non-Roman datasets that will greatly enhance the scientific output of Roman. Hosting these datasets at MAST will maximize interoperability through consistent APIs for access and discovery, reducing barriers to science by enabling streamlined workflows for co-analysis of Roman data and these synergistic non-Roman datasets.
MAST is seeking community suggestions regarding possible new datasets for this initiative!
Currently, MAST hosts data from 20+ space and ground-based missions, including the Hubble Space Telescope (HST), the James Webb Space Telescope (JWST), the Transiting Exoplanet Survey Satellite (TESS), and the Panchromatic Survey Telescope & Rapid Response System (Pan-STARRS). For a full list of missions and other datasets, please visit: https://archive.stsci.edu/missions-and-data.
What datasets will be considered?
Non-Roman datasets of any type (e.g., catalogs served either as a database or as a HEALPix-partitioned dataset, stacked images, reduced spectra, ...) that are publicly available by Sept. 2027 will be considered.
Note: Datasets that are hosted at another NASA archive, or are currently cloud-hosted, must come with a compelling case for why bringing them to MAST is critical. Such a justification could include details on why the current hosting does not support efficient co-analysis with Roman data.
Examples of datasets to be considered: Gaia DR4 (crossmatching, astrometry, proper motions, parallaxes), DESI spectroscopic redshift catalogs (calibrating/validating photometric redshifts), GALEX catalogs in HEALPix-partitioned HATS format (crossmatching, UV coverage).
Examples of datasets not appropriate for this effort: Rubin (not publicly available on initiative timescale), Euclid (hosted at IPAC/IRSA)
Dataset selection rubric: Final selections will be made by MAST and Roman leadership, considering the following aspects:
- Impact to Roman science
- Cost given available resources
- Balance among the surveys and science cases facilitated
For this community suggestion form, we are seeking:
- A summary of the dataset, its scientific impact, and its use case (approximately 1-3 paragraphs), detailing why bringing the suggested dataset to MAST is important and how it will impact Roman science.
- Please only include one dataset per submission. Multiple submissions are welcome, if you would like to suggest more than one dataset!
- If available, other specific details regarding the suggested dataset (which will help to facilitate the evaluation process).
Deadline for suggestions: 31 March 2026
Selected datasets are expected to be announced by Roman launch.
Questions regarding this questionnaire and initiative can be directed to the MAST Help Desk at archive@stsci.edu.