The NSF has funded the SCAILFIN proposal. Kyle Cranmer is a PI of this 2-year grant, which will fund Irina Espejo Morales, a PhD student in the Center for Data Science.
The main goal of the SCAILFIN project is to deploy artificial intelligence and likelihood-free inference (LFI) techniques and software using scalable cyberinfrastructure (CI) that is developed to be integrated into existing CI elements, such as the REANA system. The analysis of LHC data is the project's primary science driver, yet the technology is sufficiently generic to be widely applicable. The LHC experiments generate tens of petabytes of data annually and processing, analyzing, and sharing the data with thousands of physicists around the world is an enormous challenge. To translate the observed data into insights about fundamental physics, the important quantum mechanical processes and response of the detector to them need to be simulated to a high-level of detail and accuracy. Investments in scalable CI that empower scientists to employ ML approaches to overcome the challenges inherent in data-intensive science such as simulation-informed inference will increase the discovery reach of these experiments. The development of the proposed scalable CI components will catalyze convergent research because 1) the abstract LFI problem formulation has already demonstrated itself to be the "lingua franca" for a diverse range of scientific problems; 2) the current tools for many tasks are limited by lack of scalability for data-intensive problems with computationally-intensive simulators; 3) the tools the project is developing are designed to be scalable and immediately deployable on a diverse set of computing resources due to the design; and 4) the integration of additional commonly-used workflow languages to drive the optimization of ML components and to orchestrate large-scale workflows will lower the barrier-to-entry for researchers from other domains.