Python framework for evaluating Foundation Models (FM).
This framework consists of a container that hosts the dependencies required for an extendable collection of models, along with snapshots of model source code, supporting inference scripts, and configuration files that specify runtime parameters such as data paths and model tunings. Example runs illustrate how to invoke the scripts that execute model inference.
NOTE: The initial version of this project is deployed with restrictions:
- The container can be deployed on any platform with Singularity or Docker; however, the associated model checkpoints and statistics files are not included.
- To run the canned Python/Bash scripts, the user must log into the Discover cluster and execute the runtime scripts described below.
- All paths reflect a static Discover environment, referencing both fully qualified and relative paths to the input data.
- To change default parameters, a copy of the runtime scripts should be made by the user and modified accordingly.
- Scripts and configuration files are hard-coded with parameters for a specific Discover invocation; most originated in the separate model projects and were tweaked to run in this environment.
- Each FM is entirely independent and has a unique runtime signature.
- Output formats vary across FMs.
- The development team can provide nominal runtime assistance, but not FM model architecture expertise.
- Library to process FMs using GPU and CPU parallelization.
- Machine Learning and Deep Learning inference applications.
- Example scripts for a quick AI/ML start with your own data.
- Glenn Tamkin: glenn.s.tamkin@nasa.gov
- Jian Li: jian.li@nasa.gov
- Jordan Alexis Caraballo-Vega: jordan.a.caraballo-vega@nasa.gov
This User Guide reflects instructions for running inference scripts on Discover only.
Allocate a GPU before running the inference scripts:
salloc --gres=gpu:1 --mem=60G --time=1:00:00 --partition=gpu_a100 --constraint=rome --ntasks-per-node=1 --cpus-per-task=10

To run each of the Foundation Model tasks with qefm-core, change directories to the qefm-core root directory and run the inference script:
module load singularity
cd <Root directory>
./tests/fm-inference.sh <Container name> <Foundation Model name>

To run a specific Foundation Model task with qefm-core, use the following command:
./tests/fm-inference.sh <Container name> <Foundation Model name>

| Command-line argument | Description | Required/Optional/Flag | Default | Example |
|---|---|---|---|---|
| <Root directory> | Path to qefm-core installation | Required | N/A | /discover/nobackup/projects/QEFM/qefm-core |
| <Container name> | Name of Singularity container image (or sandbox) | Required | N/A | qefm-core-all.sif |
| <Foundation Model name> | Short title of Foundation Model | Required | N/A | gencast, aifs, aurora, fourcastnet, graphcast, pangu, privthi, sfno |
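As an illustration only (this wrapper is not part of the shipped scripts), the per-model invocations can be driven from a small shell loop over the short names listed in the table; the actual inference call is commented out here so the sketch is safe to run anywhere:

```shell
# Hypothetical wrapper loop over the supported short model names; assumes the
# qefm-core root directory and container image shown in the examples below.
for model in gencast aifs aurora fourcastnet graphcast pangu privthi sfno; do
    echo "Running inference for: $model"
    # ./tests/fm-inference.sh qefm-core-all.sif "$model"   # uncomment on Discover
done
```

Since each FM has a unique runtime signature and output format, check each model's results individually rather than assuming a uniform layout.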
Navigate to Root directory on Discover:
cd /discover/nobackup/projects/QEFM/qefm-core

Run Inference for GenCast Foundation Model:

./tests/fm-inference.sh qefm-core-all.sif gencast

Run Inference for AIFS Foundation Model:

./tests/fm-inference.sh qefm-core-all.sif aifs

Run Inference for Aurora Foundation Model:

./tests/fm-inference.sh qefm-core-all.sif aurora

Run Inference for Fourcastnet Foundation Model:

./tests/fm-inference.sh qefm-core-all.sif fourcastnet

Run Inference for Pangu Foundation Model:

./tests/fm-inference.sh qefm-core-all.sif pangu

Run Inference for Privthi Foundation Model:

./tests/fm-inference.sh qefm-core-all.sif privthi

Run Inference for GraphCast Foundation Model:

./tests/fm-inference.sh qefm-core-all.sif graphcast

Run Inference for SFNO Foundation Model:

./tests/fm-inference.sh qefm-core-all.sif sfno

Run Inference for All Foundation Models:

./tests/fm-inference.sh qefm-core-all.sif ensemble

Since Singularity caches the container image when invoked, it is important to specify the location of this cache to avoid running out of disk space. If running on Discover, /lscratch is a convenient place to create a directory path to use as a cache. See the example below for setting the appropriate environment variables:
export APPTAINER_TMPDIR=/lscratch/tdirs/gt-scratch/.cache
export APPTAINER_CACHEDIR=/lscratch/tdirs/gt-scratch/.cache
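The cache directory must exist before Singularity uses it. A minimal sketch, assuming a user-writable scratch path (CACHE_ROOT below is a stand-in for a path such as the /lscratch example above; substitute your own directory):

```shell
# Create the Apptainer/Singularity cache directory before first use.
# CACHE_ROOT is a placeholder; on Discover, point it at your /lscratch area.
CACHE_ROOT="${TMPDIR:-/tmp}/qefm-cache"
mkdir -p "$CACHE_ROOT"
export APPTAINER_TMPDIR="$CACHE_ROOT"
export APPTAINER_CACHEDIR="$CACHE_ROOT"
echo "Apptainer cache: $APPTAINER_CACHEDIR"
```

Both variables may point at the same directory, as in the exports above; what matters is that the location has enough free space to hold the unpacked container image.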