Introduction to Noether Framework ================================= Main Components --------------- The **Noether Framework** is organized into the following submodules: - ``core`` - low-level components responsible for the heavy-lifting of the framework - ``data`` - data serving utilities (e.g., datasets, preprocessors, utils, etc.) - ``io`` - data fetching and storage utilities - ``inference`` - utilities for CLI/tooling to run inference pipeline - ``modeling`` - model building blocks and some of the SOTA architectures - ``training`` - trainers, callbacks, and CLI/tooling to run the training pipeline .. figure:: /_static/noether_architecture.png :alt: Noether Architecture :align: center :width: 600px Core layering of the submodules How Do They Interact -------------------- The interaction between existing modules is better described using a typical workflow. Let's say, a user wants to train a model, like `AB-UPT `_, to do so they have two options: 1. Use configuration files to set up an experiment 2. Use code and tailor it to specific needs In either case, the same underlying shared codebase is used to ensure consistent behavior. Our main buildings blocks are located in ``core``. It takes care of things like object factories, callbacks, trackers, schemas, etc. All of which have **Base** classes that can be used as abstract classes to create custom variations, as well as ready-to-use implementations with clearly defined usage patterns. Those are usually located next to their typical application, e.g. training callbacks will be in the ``training`` submodule, and so on. To account for various levels of expertise (e.g. a seasoned ML engineer, a MSc/PhD student, a CFD expert, etc.) we provide multiple abstraction levels. The higher-level modules, like ``data``, ``modeling``, etc., give a list of convenient and frequently used blocks to get things going. They fully rely on ``core`` and are ready to be extended with some custom logic when necessary. In most cases it is recommended to extend those modules first rather than diving directly into the ``core`` itself. For example, both ``inference`` and ``training`` rely on the ``modeling`` submodule as it provides the architectures for model initialization. ``io`` on the other hand is pretty much standalone and mainly depends on the third-party packages. Currently it supports data fetching and validation from HuggingFace and AWS S3. By sharing feedback about your preferred way of storing and accessing data, you can help us prioritize future features.