How to Use a Private Data Source
To let you fetch custom data from the cloud, we provide basic command-line tooling. This document explains how to fetch data in bulk or per file.
Important
Whenever you use a CLI from the Noether framework, your data stays on your local machine!
Emmi AI doesn’t store or collect your data.
Interface and Examples
The noether.io CLI supports two forms of invocation:
noether-data <SERVICE_NAME> <COMMAND>: specify a service name (each service has its own commands).
noether-data <COMMAND>: run a global command.
Hugging Face
noether-data huggingface estimate EmmiAI/AB-UPT
noether-data huggingface ext EmmiAI/NeuralDEM .th ~/data --type model --manifest-out manifest.json
The ext command downloads all .th files from EmmiAI/NeuralDEM into ~/data.
The --manifest-out option writes a manifest for integrity checks.
AWS
noether-data aws estimate noaa-goes16 ABI-L1b-RadC/2023/001/00/
noether-data aws fetch my-bucket data/prefix/ ./data --extension .parquet --manifest-out s3-manifest.json
The fetch command downloads only the .parquet files into ./data and writes a manifest to s3-manifest.json.
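If you want to drive the fetch command from a script, one option is to assemble the argument list in Python and hand it to a subprocess. The sketch below is only an illustration: the helper name aws_fetch_command is hypothetical, the reading of the three positional arguments as bucket, prefix, and destination is inferred from the example above, and it uses only the --extension and --manifest-out options shown there.

```python
import shlex


def aws_fetch_command(bucket, prefix, dest, extension=None, manifest_out=None):
    """Build the argument list for `noether-data aws fetch` (hypothetical helper).

    Positional order (bucket, prefix, destination) is an assumption taken
    from the documented example, not from the CLI reference.
    """
    argv = ["noether-data", "aws", "fetch", bucket, prefix, dest]
    if extension is not None:
        argv += ["--extension", extension]
    if manifest_out is not None:
        argv += ["--manifest-out", manifest_out]
    return argv


argv = aws_fetch_command(
    "my-bucket", "data/prefix/", "./data",
    extension=".parquet", manifest_out="s3-manifest.json",
)
# Print the command as a single shell-safe string for inspection.
print(shlex.join(argv))
```

To actually run the command, the list can be passed to subprocess.run(argv, check=True), which raises an exception if the CLI exits with a non-zero status.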
Verification
Verification checks whether the downloaded files are complete. If manifest.json exists, corrupted or missing files can be redownloaded:
noether-data verification check -r ./data -m manifest.json --action redownload
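To make the idea of a manifest-based check concrete, here is a minimal sketch of how files could be classified as intact, missing, or corrupted. It is not the Noether implementation: the manifest layout (relative path mapped to a "sha256" entry) and the function names are assumptions made for illustration, and the real manifest format may differ.

```python
import hashlib
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_files(root: Path, manifest: dict) -> dict:
    """Classify every manifest entry as ok, missing, or corrupted.

    `manifest` maps a relative path to {"sha256": <hex digest>}; this
    layout is hypothetical and chosen only for the example.
    """
    report = {"ok": [], "missing": [], "corrupted": []}
    for rel_path, expected in manifest.items():
        target = root / rel_path
        if not target.is_file():
            report["missing"].append(rel_path)
        elif file_sha256(target) != expected["sha256"]:
            report["corrupted"].append(rel_path)
        else:
            report["ok"].append(rel_path)
    return report
```

In this sketch, the entries listed under "missing" and "corrupted" are the ones a redownload step would need to fetch again.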
If no manifest exists, create one with:
noether-data verification build -r ./data -m manifest.json
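Conceptually, building a manifest means walking the data directory and recording enough information about each file to detect later changes. The following sketch shows one way this could be done with file sizes and SHA-256 digests. The JSON layout and the names build_manifest and write_manifest are hypothetical; the file written by noether-data may be structured differently.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict:
    """Map each file path (relative to root) to its size and digest.

    The {"size": ..., "sha256": ...} layout is an assumption for this example.
    """
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rel_path = path.relative_to(root).as_posix()
            manifest[rel_path] = {
                "size": path.stat().st_size,
                "sha256": sha256_of(path),
            }
    return manifest


def write_manifest(root: Path, out_file: Path) -> None:
    """Serialize the manifest as indented JSON."""
    out_file.write_text(json.dumps(build_manifest(root), indent=2))
```

Storing paths relative to the root keeps the manifest valid if the data directory is moved or copied to another machine.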
To explore all options, use the --help flag.