Data Lake for experimental data.
- Azure Blob Storage for files.
- Azure Cosmos DB for metadata.
- Azure Batch for post-processing the data.
- Pay-as-you-go (PAYG) serverless computing.
When you run a physical experiment, a simulation, or an ML training job, you generate precious data.
- Use an Azure Blob Storage container to store all the data in a project.
- For each run keep all files together, in a "Run".
- Keep run metadata in a JSON file.
The output of an experiment, a simulation, or an ML training job is usually a file or a set of files. To reproduce the experiment, the simulation, or the training, you usually need the same input files and the same input parameters.
A Run is a set of files that includes input files, output files and a JSON file with metadata. The JSON file captures input parameters, environment and annotations.
A Run is the unit of operation for Project Exekias tools and services.
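The following is a minimal Python sketch of capturing run metadata in a params.json file. The function name write_run_metadata and the keys "parameters" and "environment" are hypothetical, chosen only for illustration; Exekias does not prescribe them here. Annotations are written as top-level keys so that a key such as "description" can be queried directly.

```python
import json
import platform
import sys
import tempfile
from pathlib import Path


def write_run_metadata(run_dir: Path, parameters: dict, annotations: dict) -> Path:
    """Write params.json into run_dir and return the path of the file.

    The layout of the JSON document is an assumption made for this sketch.
    """
    metadata = {
        "parameters": parameters,  # input parameters of the run (hypothetical key)
        "environment": {           # where the run was executed (hypothetical key)
            "python": sys.version.split()[0],
            "platform": platform.platform(),
        },
        **annotations,             # free-form annotations, e.g. {"description": "sample"}
    }
    run_dir.mkdir(parents=True, exist_ok=True)
    path = run_dir / "params.json"
    path.write_text(json.dumps(metadata, indent=2), encoding="utf-8")
    return path


if __name__ == "__main__":
    # Write the metadata of an example run into a temporary directory and show it.
    with tempfile.TemporaryDirectory() as tmp:
        params_path = write_run_metadata(
            Path(tmp) / "20230609-142856-title",
            parameters={"learning_rate": 0.01},
            annotations={"description": "sample"},
        )
        print(params_path.read_text(encoding="utf-8"))
```

Input and output files of the run would be placed in the same folder next to params.json.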
Create a configuration file using the interactive exekias command:
exekias config create
The config file contains the addresses of the deployed blob container and of the Cosmos DB database linked to the container.
Create a uniquely named folder like 20230609-142856-title with a JSON file params.json. The first sequence of digits in the folder name is the run start date,
the second sequence of digits is the run start time and the last portion is the title. These three parts will become searchable run metadata along with
the content of the params.json file.
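The naming convention above can be sketched in Python as follows. This assumes, based only on the example 20230609-142856-title, that the date is written as YYYYMMDD and the time as HHMMSS; the helper names make_run_name and parse_run_name are hypothetical, and the pattern Exekias actually applies may differ.

```python
import re
from datetime import datetime

# Assumed layout: 8 digits (date), 6 digits (time), then the title.
RUN_NAME = re.compile(r"^(?P<date>\d{8})-(?P<time>\d{6})-(?P<title>.+)$")


def make_run_name(start: datetime, title: str) -> str:
    """Build a folder name from the run start time and a title."""
    return f"{start:%Y%m%d}-{start:%H%M%S}-{title}"


def parse_run_name(name: str) -> dict:
    """Split a folder name into the run start time and the title."""
    match = RUN_NAME.match(name)
    if match is None:
        raise ValueError(f"not a run folder name: {name!r}")
    start = datetime.strptime(match["date"] + match["time"], "%Y%m%d%H%M%S")
    return {"start": start, "title": match["title"]}


if __name__ == "__main__":
    name = make_run_name(datetime(2023, 6, 9, 14, 28, 56), "title")
    print(name)
    print(parse_run_name(name))
```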
Upload the folder to the Azure Blob Storage container.
- exekias command: exekias data upload <folder>
- exekias command: exekias query "run.params.description='sample'"
  This lists all runs that have {"description": "sample"} in their params.json file. The filter expression is any Azure Cosmos DB SQL scalar expression.
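As an illustration of how such a filter could be assembled from a script, here is a small Python sketch. The helpers equality_filter and query_command are hypothetical and not part of Exekias; the sketch only covers string equality on a top-level key of params.json and rejects values with quotes or backslashes instead of trying to escape them.

```python
import re
from typing import List

# Only simple identifiers are accepted as field names in this sketch.
FIELD_NAME = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def equality_filter(field: str, value: str) -> str:
    """Return a filter such as run.params.description='sample'."""
    if not FIELD_NAME.match(field):
        raise ValueError(f"unsupported field name: {field!r}")
    if any(ch in value for ch in "'\"\\"):
        # Escaping is deliberately left out of this sketch.
        raise ValueError(f"unsupported characters in value: {value!r}")
    return f"run.params.{field}='{value}'"


def query_command(filter_expression: str) -> List[str]:
    """Return the argument list for the exekias query command."""
    return ["exekias", "query", filter_expression]


if __name__ == "__main__":
    print(query_command(equality_filter("description", "sample")))
```

Passing the resulting list to subprocess.run, rather than a single shell string, avoids a second layer of shell quoting around the filter expression.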
- Download a folder with Azure Storage Explorer.
- Download a folder with AzCopy, SDKs and other tools.
- exekias command: exekias data download <run_id> <folder>
You will need the Contributor role in an Azure subscription.
exekias backend deploy
The interactive command can either create the metadata services and connect them to an existing blob container, or create a new blob container for you.
The GitHub codespace created from the repository has all the necessary tools set up.
- Set up Visual Studio Code with Dev Containers as described in the documentation.
- Start VS Code and run "Dev Containers: Clone Repository in Container Volume..." from the Command Palette.
- Enter microsoft/exekias as a Git URI.
The container will have all the necessary tools installed.
See "Open a Git repository or GitHub PR in an isolated container volume" for details.
- Install dotnet (https://dotnet.microsoft.com) SDK 6.0 or later.
- Install bicep (https://aka.ms/bicep) CLI.
- Set the BicepPath environment variable to point to the bicep executable.
- Install PowerShell (https://aka.ms/powershell) with module Az.
- Install azurite, the Azure Storage emulator: https://learn.microsoft.com/en-us/azure/storage/common/storage-use-azurite#install-azurite
- Install NetCDF (https://www.unidata.ucar.edu/software/netcdf/). On Ubuntu, the package name is libnetcdf-dev.
- Copy the netCDF Windows .dll files and set the LIBNETCDFPATH environment variable to the path of netcdf.dll.
  - On Windows, download and install the latest Windows netCDF NC4 x64 installer package from https://docs.unidata.ucar.edu/netcdf-c/current/winbin.html
  - Compress all the .dll files, e.g.
    Compress-Archive "C:\Program Files\netCDF 4.9.2\bin\*.dll" netcdf-win.zip
  - Copy the archive to the Linux machine and decompress it, e.g.
mkdir -p ~/netcdf/bin && unzip netcdf-win.zip -d ~/netcdf/bin
- Run azurite, preferably in a separate terminal:
cd ~/azurite
azurite-blob
- Run the dotnet test commands:
dotnet test src/Exekias.Core.Tests/
dotnet test src/Exekias.SDS.Tests/
dotnet test src/Exekias.AzureStorageEmulator.Tests/
- Run the dotnet publish command, e.g.
dotnet publish src/exekias
- Test that the command runs:
./src/exekias/bin/Release/net8.0/publish/exekias -h
- Log in to Azure with your account:
pwsh -c Connect-AzAccount -UseDeviceAuthentication
- Deploy a backend instance and create a configuration file. You will need the Owner role in the resource group for the deployment to succeed.
./src/exekias/bin/Release/net8.0/publish/exekias backend deploy
./src/exekias/bin/Release/net8.0/publish/exekias config create
- Start the canary test:
.\integration-test.ps1 <resource_group> <storage_account_name> ./src/exekias/bin/Release/net8.0/publish/exekias