Research Software

Overview

GAIA HazLab relies on research software spanning data I/O, AI/ML frameworks, visualization tools, and domain-specific research packages. This page catalogs the key tools we use and develop to build reproducible, scalable workflows for multi-hazard assessment.

Our goal is to develop a Research Software Agent that integrates these tools into cohesive workflows, enabling rapid prototyping, experimentation, and deployment of AI-driven hazard models.


Core Data I/O Packages

Essential packages for reading, writing, and processing geophysical and environmental data.

ObsPy

Description: The seismological observatory in Python
Use Cases: Reading seismic waveforms (MiniSEED, SAC), earthquake catalogs, station metadata
Website: https://obspy.org/

EarthScope SDK

Description: Python SDK for accessing EarthScope (formerly IRIS) seismic data services
Use Cases: Downloading waveforms, station metadata, earthquake catalogs
Website: https://earthscope.org/

xarray

Description: N-dimensional labeled arrays and datasets in Python
Use Cases: Climate data, weather forecasts, multi-dimensional gridded data
Website: https://xarray.dev/
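To illustrate what "labeled" means for gridded data, here is a small sketch using a synthetic field. The variable name `precip`, the grid, and the values are assumptions made up for the example.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic gridded field: 4 daily time steps on a 3 x 5 lat/lon grid.
times = pd.date_range("2024-01-01", periods=4, freq="D")
lats = np.array([45.0, 46.0, 47.0])
lons = np.array([-124.0, -123.0, -122.0, -121.0, -120.0])
values = np.arange(4 * 3 * 5, dtype=float).reshape(4, 3, 5)

precip = xr.DataArray(
    values,
    coords={"time": times, "lat": lats, "lon": lons},
    dims=("time", "lat", "lon"),
    name="precip",  # hypothetical variable name
)

# Label-based selection: time series at the grid cell nearest a point.
point_series = precip.sel(lat=46.2, lon=-122.4, method="nearest")

# Reduce over a named dimension rather than a positional axis.
time_mean = precip.mean(dim="time")

print(point_series.values)
print(time_mean.shape)
```

Selecting by coordinate value and reducing by dimension name keeps the code independent of the array's axis order.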

Rasterio

Description: Geospatial raster I/O for Python
Use Cases: Reading satellite imagery, DEMs, landslide susceptibility maps
Website: https://rasterio.readthedocs.io/

GeoPandas

Description: Geographic data manipulation in Python
Use Cases: Vector data (fault lines, flood zones, landslide inventories)
Website: https://geopandas.org/

Pandas

Description: Data analysis and manipulation library
Use Cases: Tabular data, time series, catalogs, stream gauge data
Website: https://pandas.pydata.org/
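A minimal sketch of the catalog and time-series use cases: the snippet builds a small, invented earthquake catalog in memory and counts events per day. The column names and values are assumptions for illustration, not a real dataset.

```python
import pandas as pd

# Hypothetical mini-catalog; the values are invented for illustration only.
catalog = pd.DataFrame(
    {
        "time": pd.to_datetime(
            [
                "2024-01-01 03:15:00",
                "2024-01-01 18:40:00",
                "2024-01-02 07:05:00",
                "2024-01-04 22:30:00",
            ]
        ),
        "magnitude": [2.1, 3.4, 1.8, 4.0],
    }
).set_index("time")

# Count events per calendar day; days without events get a count of 0.
daily_counts = catalog["magnitude"].resample("D").count()

# Keep only events at or above an (arbitrary) magnitude threshold.
larger_events = catalog[catalog["magnitude"] >= 3.0]

print(daily_counts)
print(larger_events)
```

Indexing the table by time is what makes `resample` available; the same pattern applies to stream gauge records.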


AI/ML Software Packages

Frameworks for building, training, and deploying machine learning models.

PyTorch

Description: Deep learning framework
Use Cases: Neural networks for earthquake detection, atmospheric river (AR) forecasting, landslide classification
Website: https://pytorch.org/
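As a rough sketch of what a detection-style network can look like in PyTorch, the example below defines a tiny 1D convolutional classifier and runs one training step on random data. The architecture, class name, and input shape (3 components, 1000 samples) are assumptions chosen for brevity, not a model used by the lab.

```python
import torch
from torch import nn


class WaveformClassifier(nn.Module):
    """Toy 1D CNN mapping a 3-component window to class logits."""

    def __init__(self, n_channels=3, n_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(n_channels, 8, kernel_size=7, padding=3),
            nn.ReLU(),
            nn.MaxPool1d(4),
            nn.Conv1d(8, 16, kernel_size=7, padding=3),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),  # collapse the time axis to length 1
        )
        self.head = nn.Linear(16, n_classes)

    def forward(self, x):
        # x has shape (batch, channels, samples)
        return self.head(self.features(x).squeeze(-1))


torch.manual_seed(0)
model = WaveformClassifier()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Random stand-ins for waveform windows and their labels (noise vs. event).
batch = torch.randn(4, 3, 1000)
labels = torch.tensor([0, 1, 0, 1])

# One optimization step: forward pass, loss, backward pass, update.
logits = model(batch)
loss = loss_fn(logits, labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()

print(logits.shape, float(loss))
```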

PyTorch Lightning

Description: High-level PyTorch interface for research
Use Cases: Standardized training loops, multi-GPU training
Website: https://lightning.ai/

Hugging Face Transformers

Description: State-of-the-art transformer models
Use Cases: Foundation models for spatiotemporal forecasting (ACE2)
Website: https://huggingface.co/transformers/

Collaboration: GAIA HazLab uses ACE2 for atmospheric river forecasting and extreme weather prediction

scikit-learn

Description: Machine learning library for Python
Use Cases: Random forests, SVMs, clustering for landslide susceptibility
Website: https://scikit-learn.org/
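The random-forest use case can be sketched on synthetic data as follows. The two predictors (slope and rainfall) and the labeling rule are invented for the example; a real susceptibility model would use mapped terrain attributes and a landslide inventory.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic predictors standing in for terrain attributes (assumed features).
n_samples = 500
slope_deg = rng.uniform(0, 60, n_samples)
rainfall_mm = rng.uniform(0, 300, n_samples)
X = np.column_stack([slope_deg, rainfall_mm])

# Toy labeling rule: steep and wet cells are "landslide" (1), otherwise 0.
y = ((slope_deg > 30) & (rainfall_mm > 150)).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# The class-1 probability can be read as a susceptibility score in [0, 1].
susceptibility = model.predict_proba(X_test)[:, 1]
accuracy = model.score(X_test, y_test)

print(susceptibility[:5], accuracy)
```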


Visualization Packages

Tools for creating publication-quality figures, interactive plots, and geospatial visualizations.

PyGMT

Description: Python interface to the Generic Mapping Tools
Use Cases: Publication-quality maps, cross-sections, 3D visualizations
Website: https://www.pygmt.org/

Plotly

Description: Interactive graphing library
Use Cases: Interactive dashboards, time series exploration, 3D visualizations
Website: https://plotly.com/python/

Matplotlib

Description: Comprehensive plotting library
Use Cases: Static plots, subplots, waveform displays
Website: https://matplotlib.org/
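A short sketch of a stacked waveform display with shared time axis. The three "components" are synthetic sinusoids with noise, assumed here only to have something to plot; the figure is written to an in-memory buffer so the example runs without a display.

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend, safe on headless machines

import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0.0, 10.0, 1000)

# Synthetic stand-ins for vertical, north, and east components.
components = {
    "Z": np.sin(2 * np.pi * 1.0 * t) + 0.1 * rng.standard_normal(t.size),
    "N": np.sin(2 * np.pi * 0.5 * t) + 0.1 * rng.standard_normal(t.size),
    "E": np.cos(2 * np.pi * 0.5 * t) + 0.1 * rng.standard_normal(t.size),
}

# One row per component, all sharing the same time axis.
fig, axes = plt.subplots(len(components), 1, sharex=True, figsize=(8, 5))
for ax, (name, samples) in zip(axes, components.items()):
    ax.plot(t, samples, linewidth=0.8, color="black")
    ax.set_ylabel(name)
axes[-1].set_xlabel("Time (s)")
fig.tight_layout()

# Render to PNG bytes; pass a file path instead to write to disk.
buffer = io.BytesIO()
fig.savefig(buffer, format="png", dpi=150)
plt.close(fig)
```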

Seaborn

Description: Statistical data visualization
Use Cases: Distributions, correlations, statistical relationships
Website: https://seaborn.pydata.org/

Holoviews & Panel

Description: Interactive data visualization and dashboards
Use Cases: Exploratory data analysis, interactive parameter tuning
Website: https://holoviews.org/, https://panel.holoviz.org/


Research Core Packages

Domain-specific packages developed by the research community for specialized workflows.

NoisePy

Description: Python package for seismic ambient noise cross-correlation
Use Cases: Ambient noise tomography, seismic velocity monitoring
Website: https://github.com/noisepy/NoisePy

Developed by: Denolle Lab (Marine Denolle, Yiyu Ni, et al.)

SeisBench

Description: Benchmark suite for seismological machine learning
Use Cases: Earthquake detection, phase picking, magnitude estimation
Website: https://github.com/seisbench/seisbench

Used in: Strong earthquake detection models (GAIA HazLab)

PySINDy

Description: Sparse Identification of Nonlinear Dynamics
Use Cases: Data-driven modeling, reduced-order models, surrogate modeling
Website: https://github.com/dynamicslab/pysindy

Used in: Reduced-order modeling for landslide dynamics, debris flow

PySHRED

Description: Python package for Shallow Recurrent Decoder (SHRED) models
Use Cases: Spatiotemporal forecasting, sensor placement optimization
Website: https://github.com/shervinsahba/pyshred

Used in: Multi-sensor fusion for hydromechanical modeling

Landlab

Description: Python toolkit for numerical modeling of Earth surface dynamics
Use Cases: Landscape evolution, debris flow, erosion modeling, landslide dynamics
Website: https://landlab.github.io/

Used in: Data-driven reduced-order modeling for landscape evolution, debris flow modeling, and landslide susceptibility analysis

ESMValTool

Description: Earth System Model Evaluation Tool
Use Cases: Climate model evaluation, multi-model analysis
Website: https://www.esmvaltool.org/

rslearn (AI2)

Description: Remote sensing machine learning library from Allen Institute for AI
Use Cases: SAR processing, change detection, multi-temporal analysis, landslide detection
Website: https://github.com/allenai/rslearn

Collaboration: GAIA HazLab is collaborating with AI2 on landslide detection using rslearn to fine-tune foundation models on SAR imagery and multi-sensor networks (Akash Kharita, Scott Henderson)


Development & Deployment Tools

Supporting tools for research software development, version control, and deployment.

Jupyter & JupyterLab

Description: Interactive computing environment
Use Cases: Exploratory analysis, tutorials, reproducible research
Website: https://jupyter.org/

Conda & Mamba

Description: Package and environment management
Use Cases: Creating reproducible environments, dependency management
Website: https://conda.io/, https://mamba.readthedocs.io/

Docker & Singularity

Description: Containerization platforms
Use Cases: Reproducible computing environments, HPC deployment
Website: https://docker.com/, https://sylabs.io/singularity/

Dask

Description: Parallel computing library
Use Cases: Large dataset processing, distributed computing
Website: https://dask.org/
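To show how Dask splits a large array into chunks and defers computation, here is a small sketch. The array size and chunk shape are arbitrary choices for the example.

```python
import dask.array as da

# A 4000 x 4000 array stored as sixteen 1000 x 1000 chunks.
x = da.ones((4000, 4000), chunks=(1000, 1000))

# Operations build a task graph; nothing is computed yet.
lazy_mean = (x + 1).mean()

# compute() executes the graph, processing chunks in parallel.
value = lazy_mean.compute()

print(x.numblocks, value)
```

Because evaluation is lazy, the same code can run on arrays that do not fit in memory, as long as each chunk does.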

MLflow

Description: Machine learning lifecycle management
Use Cases: Experiment tracking, model registry, deployment
Website: https://mlflow.org/


Research Software Agent Vision

Our vision is to develop an integrated Research Software Agent that:

  1. Automates Workflows: Chain together data loading, preprocessing, model training, and visualization

  2. Enables Reproducibility: Track dependencies, versions, and compute environments

  3. Facilitates Discovery: Search and recommend appropriate tools for specific tasks

  4. Provides Templates: Offer ready-to-use templates for common workflows

  5. Integrates AI: Use LLMs to generate code, debug issues, and optimize pipelines
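The first two goals above (chaining steps and tracking the environment) can be sketched with the standard library alone. Everything here is hypothetical: the step functions, `run_pipeline`, and `environment_snapshot` are illustrative names, not part of an existing agent.

```python
import platform
import sys
from importlib import metadata


def load(_):
    """Stand-in for a data-loading step; returns invented values."""
    return [3.0, 1.0, 4.0, 1.0, 5.0]


def preprocess(values):
    """Remove the mean, as a placeholder for real preprocessing."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def summarize(values):
    """Placeholder for a modeling or visualization step."""
    return {"n": len(values), "max_abs": max(abs(v) for v in values)}


def run_pipeline(steps, initial=None):
    """Run steps in order, feeding each output into the next step."""
    result, log = initial, []
    for step in steps:
        result = step(result)
        log.append(step.__name__)  # record which steps ran, in order
    return result, log


def environment_snapshot(packages):
    """Record interpreter, platform, and installed package versions."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # package is not installed
    return {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "packages": versions,
    }


result, log = run_pipeline([load, preprocess, summarize])
snapshot = environment_snapshot(["numpy", "a-package-that-is-not-installed"])

print(log)
print(result)
print(snapshot)
```

Storing the step log and the snapshot alongside each result is one simple way to make a run traceable later.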


Contributing

We welcome contributions. See our Contributing Guide for details.


Resources

Package Ecosystems

Learning Resources


Future Development

We are actively working on: