
Projects

Projects are the primary unit of resource allocation on the platform. Every workspace you launch consumes quota from a project, and every config file or deployment belongs to a project.

Access: All authenticated users. A personal project is automatically created for you when you register.


Project Lifecycle

stateDiagram-v2
    [*] --> Active : Create Project
    Active --> Archived : Admin Archives
    Archived --> Active : Admin Restores
    Active --> [*] : Admin Deletes

Project Page Layout

Figure 1: Projects list showing grid view with GPU usage bars per project.

The Projects page supports two view modes:

  • Grid view — cards with project name, description, member count, GPU usage bar
  • List view — compact table with sortable columns

Toggle between views using the icons in the top-right of the toolbar.


Create a Project

  1. Click Projects in the left sidebar.
  2. Click + New Project (top-right button).
  3. Fill in the project form:
     • Name — Short identifier (alphanumeric, hyphens allowed)
     • Display Name — Human-readable title shown in the UI
     • Description — Optional; what this project is for
  4. Click Create. The project appears in your list immediately.

Quota is assigned by an admin

A new project starts with zero GPU quota. Contact your admin or submit a Request to have a Resource Plan bound to your project.


Project Detail Tabs

Click a project card to open its detail page. The detail page includes core tabs for Overview, Config, Plan, Builds, GPU Claims, and Resources, plus role-dependent tabs such as Images, Members, and Storage.

Figure 2: Project detail page with Overview, Config, Plan, Builds, GPU Claims, Resources, and role-dependent tabs.

  • Overview — Description, quota usage chart, current plan, member list summary
  • Config — Versioned config files (Kubernetes YAML, scripts)
  • Plan — Bound resource plan, schedule window, allowed GPU models, and quota envelope
  • Builds — Project-scoped image builds with archive, storage, or Dockerfile source selection, resource controls, and Kaniko build logs
  • GPU Claims — DRA-based ResourceClaims that can be prepared ahead of time and consume GPU only while bound to live Pods
  • Resources — Live per-user GPU/CPU/memory usage and expandable Kubernetes resource views
  • Images / Members / Storage — Additional tabs shown when your role allows image management, membership management, quota controls, or storage access

Build Project Images

Use the Builds tab when a project needs a container image that is not already in the allowed image list. Project Managers and Project Admins can start builds when image builds are enabled for the project. Project Members can view build status and logs, but cannot start or delete builds.

Source Options

  • Upload archive — Upload a .tar.gz or .zip build context. Put the Dockerfile at the archive root (see the example layout below).
  • From Storage — Build from a path in your personal storage or a project storage volume you can read.
  • Dockerfile only — Paste a Dockerfile directly. The platform creates a minimal build context.
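
For Upload archive, the Dockerfile must sit at the top level of the compressed context. An illustrative layout before compression (all file names here are examples only):

    trainer-context/
    ├── Dockerfile
    ├── requirements.txt
    └── src/
        └── main.py

Compress the contents of trainer-context/ rather than the folder itself, so that the Dockerfile ends up at the root of the resulting .tar.gz or .zip.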

Build Settings

  • Image Name — Short repository name, for example trainer-api. The final registry path is generated by the platform.
  • Tag — Optional. Leave empty for the platform default, or set a release tag such as v1.0.1.
  • CPU Cores — Build CPU request. It cannot exceed the remaining project build capacity or the platform maximum of 4.
  • Memory (GB) — Build memory request. It cannot exceed the remaining project build capacity or the platform maximum of 16.
  • Timeout (minutes) — Positive integer up to the platform maximum (120 by default). Kubernetes also stops the job at this deadline.

Local disk for Kaniko cache and temporary layers is capped by the platform admin per environment; it is not user-selectable from the build form. Uploaded archive contexts are checked for unsafe paths and expanded-size limits before the builder uses them. DockerHub base images are pulled through the platform Harbor proxy-cache path when it is configured.
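
For orientation, a build like this typically runs as a Kubernetes Job wrapping the Kaniko executor. The sketch below is hypothetical; the platform's actual manifest, image versions, paths, and flags may differ, but it shows how the form fields plausibly map onto Job fields:

    apiVersion: batch/v1
    kind: Job
    metadata:
      name: build-trainer-api                  # hypothetical name
    spec:
      activeDeadlineSeconds: 3600              # Timeout (minutes) converted to seconds
      backoffLimit: 0                          # one attempt per submitted build
      template:
        spec:
          restartPolicy: Never
          containers:
          - name: kaniko
            image: gcr.io/kaniko-project/executor:latest
            args:
            - --dockerfile=Dockerfile
            - --context=tar:///workspace/context.tar.gz     # uploaded archive; path is illustrative
            - --destination=harbor.example.com/myproject/trainer-api:v1.0.1   # registry path is illustrative
            resources:
              requests:
                cpu: "2"                       # CPU Cores field
                memory: 8Gi                    # Memory (GB) field
              limits:
                cpu: "4"                       # platform maximum
                memory: 16Gi                   # platform maximum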

After submission, the build history shows status, source type, requested resources, destination image, and live logs. Successful builds are pushed to the Harbor namespace selected by the project's group registry policy; GPU23 Harbor groups build in the dedicated GPU23 lane. The builder or a project manager can delete build artifacts from the build history.

Build access is project-governed

If the start form is disabled, ask a platform admin to enable Image Build Permission for the project and confirm that the project has an active resource plan.

Build from personal storage

The From Storage mode resolves your personal volume from the platform database first, then falls back to the standard user-<name>-storage namespace and user-<name>-disk PVC if the database record is missing but the namespace exists in the cluster. If you still see the "personal storage is not initialized" message, open the Storage page once to trigger initialization or ask an admin to re-run user storage init.


GPU Claims (DRA)

The GPU Claims tab lists ResourceClaims that can be prepared for a project ahead of deployment. A claim is pending while it has no live Pod binding; it starts consuming quota and GPU-hours only after Kubernetes DRA allocates it and binds it to a running Pod. Each claim is created through the Kubernetes Dynamic Resource Allocation API and can be selected from the deploy dialog when a Pod or Deployment template uses a deploy-time GPU claim slot annotation.

Create a Claim

  • Claim Name — DNS-safe name used by deployments.
  • GPU Device Class — The model the scheduler reserves, such as rtx5090.gpu.nvidia.com; the form defaults to the first concrete model and lists gpu.nvidia.com (Any) only as an explicit generic placement choice.
  • GPU Count — Physical cards reserved by this claim. Multi-GPU claims remain one claim-scoped allocation.
  • SM Share — Compute slice from 1% to 100%. The effective GPU count is shown beside the form.
  • VRAM Mode — Elastic shares VRAM with peers; Hard cap enforces the selected percentage.
  • VRAM Share — Used only when Hard cap is selected; the platform derives the pinned memory from the smallest matching DRA ResourceSlice memory for the selected GPU Device Class.
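
Behind this form, each claim is a Kubernetes DRA ResourceClaim. The sketch below is an approximation assuming the resource.k8s.io/v1beta1 API; the exact API version the platform targets, and where it records SM Share and VRAM settings, are assumptions (shown as hypothetical platform-go annotations):

    apiVersion: resource.k8s.io/v1beta1        # assumed DRA API version
    kind: ResourceClaim
    metadata:
      name: train-a-claim                      # Claim Name (DNS-safe)
      annotations:
        platform-go/sm-share: "50"             # hypothetical: SM Share percentage
        platform-go/vram-mode: hard-cap        # hypothetical: VRAM Mode
    spec:
      devices:
        requests:
        - name: gpu
          deviceClassName: rtx5090.gpu.nvidia.com   # GPU Device Class
          allocationMode: ExactCount
          count: 2                             # GPU Count; still one claim-scoped allocation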

A Deployment that uses a reusable claim points every replica at the same ResourceClaim. For example, a 2-GPU rtx5090.gpu.nvidia.com claim remains one shared 2-GPU allocation; platform labels keep dra-gpu-count=2, dra-effective-gpu=2, and dra-gpu-model=rtx5090, while quota and GPU-hours count it once per claim instead of once per replica.
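
In Kubernetes terms, the replicas share a claim because each Pod references the ResourceClaim by name instead of templating a new one per replica. A rough sketch of the pod spec fragment the platform injects (you never write these fields yourself; see the config notes below):

    spec:
      resourceClaims:
      - name: gpu                              # local handle inside this pod spec
        resourceClaimName: train-a-claim       # the shared, pre-created ResourceClaim
      containers:
      - name: trainer
        image: harbor.example.com/myproject/trainer-api:v1.0.1   # illustrative
        resources:
          claims:
          - name: gpu                          # container opts into the claim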

A pending claim has 0 GPU-hours and does not reduce quota. The table shows owner, allocation node, current status (pending, bound, or terminating), the reservedFor count reported by DRA, and a delete action. Delete is allowed for pending claims and rejected while a live Pod is using the claim.

Claim ownership

Project Members can create claims under their own namespace. Project Managers can also list and delete claims belonging to other members of the project.


Plan Window Countdown

When a project's resource plan defines a schedule window, the Plan tab shows a live countdown to the next boundary:

  • Window closes in HH:MM:SS while the window is open.
  • Window opens in HH:MM:SS when the next window is upcoming.
  • Always active if no schedule restriction is defined.

The same countdown surfaces on the workspace launch form and on individual deployments. Pods running in plan-bound queues are subject to the platform's plan-window reaper when the window expires.


Config File Versioning

Config files are stored using Content-Addressable Storage — each version is immutable and identified by a SHA-256 hash.

sequenceDiagram
    participant User
    participant Dashboard
    participant API
    participant CAS as Content Store (SHA-256)

    User->>Dashboard: Upload new K8s YAML
    Dashboard->>API: POST /api/v1/configfiles
    API->>CAS: Store blob by SHA-256
    CAS-->>API: blob_id
    API-->>Dashboard: config_commit_id (immutable)
    Dashboard-->>User: Version created ✓

Working with Config Files

  1. Go to Projects → [your project] → Config tab.
  2. Click + New Config to upload a YAML file or paste content in the editor.
  3. Each save creates a new, immutable version — old versions are never overwritten.
  4. Click the History icon next to a config to view all versions.
  5. To deploy a specific version, click Deploy on the version row.
  6. In the deploy dialog, pick the scheduling queue. If the YAML contains GPU claim slots, choose one of your reserved GPU claims for each slot for this deploy only.

DRA GPU Requests vs. Reusable GPU Claims

  • Template GPU requests use DRA resource keys in the YAML container request, for example nvidia.com/gpu-0: "1" or nvidia.com/rtx6000: "1", plus optional SM percentage and GPU model metadata. The platform creates the runtime DRA claim during deployment. Both styles are sketched in the YAML example after this list.
  • Reusable GPU claims are standalone claims created from the GPU Claims tab. They are pending until a live Pod uses them, then quota and resource-hours count the bound effective GPU once per claim, even when a Deployment has multiple replicas. In the Resource Wizard, choose Pick at deploy and enter a Deploy slot name such as train-a; this is a label shown in the deploy dialog, not the claim name itself. In raw YAML, put platform-go/dra-claim-name: '{{ gpuClaimName "train-a" }}' in a Pod metadata annotation or a Deployment pod-template annotation. The deploying user maps each slot to one of their own claims in the deploy dialog.
  • {{gpuClaimName}} is still supported as the default slot for older config files. Multiple Pods or Deployments can use the same slot to intentionally share the same claim, while different resources can use different slots and therefore different claims.
  • Kubernetes Deployments use one PodTemplate for every replica. If only some replicas should use a claim, split the workload into multiple Deployments, for example one claimed Deployment and one unclaimed Deployment.
  • Services, ConfigMaps, and unannotated workloads are not injected with GPU claims.
  • Do not put {{gpuClaimName}} in spec.resourceClaims[].resourceClaimName or resourceClaimTemplateName; those fields are reserved for platform injection.
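
A minimal sketch contrasting the two styles. Only the annotation syntax and resource keys are taken from the bullets above; the image names and everything else are illustrative assumptions:

    # Style 1: template GPU request. The platform creates the runtime DRA claim at deploy time.
    apiVersion: v1
    kind: Pod
    metadata:
      name: oneshot-job
    spec:
      containers:
      - name: worker
        image: harbor.example.com/myproject/trainer-api:v1.0.1   # illustrative
        resources:
          limits:
            nvidia.com/rtx6000: "1"            # DRA resource key; exact placement (requests vs. limits) is an assumption
    ---
    # Style 2: reusable claim slot. The deploying user maps slot "train-a" to one of
    # their own claims in the deploy dialog; both replicas share that single claim.
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: trainer
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: trainer
      template:
        metadata:
          labels:
            app: trainer
          annotations:
            platform-go/dra-claim-name: '{{ gpuClaimName "train-a" }}'
        spec:
          containers:
          - name: trainer
            image: harbor.example.com/myproject/trainer-api:v1.0.1   # illustrative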

Config files require Project Manager role

Only project members with Manager or higher role can create or edit config files.


Managing Members

  1. Open the project and go to the Members tab.
  2. Click + Add Members (requires Project Admin role).
  3. Search for users by username or email. The add-member page loads project context, current members, and a paginated user list in one request.
  4. Select a role for each user:
     • User — Launch workspaces, view configs
     • Manager — Edit configs, manage deployments
     • Admin — Add/remove members, set member quotas, delete project
  5. Click Confirm to save.

Project Admins can also select multiple rows in the Members tab and batch-remove users, batch-change roles, or batch-set per-user quotas. Project Managers and Project Admins can view member quotas; Project Users can view only their own quota. A per-user quota of 0 for GPU, CPU, or memory means that resource is unlimited for that member, though it remains bounded by the project plan.

Group-inherited members in batch actions. Per-user quota is stored at the project level, so group-inherited members are selectable in the Members table and can be included in a batch-set quota action. The batch-remove and batch-role actions are disabled whenever the selection contains a group-inherited member, because their membership and role come from the owning group and must be changed there. A Group Admin viewing a project owned by their group is treated as an effective Project Admin (see Group Inheritance and Effective Project Role below), so they will see the selection checkboxes and can drive batch quota updates without first being added as a direct project member.

Removing a direct member also removes that member's project namespace, project storage permission overrides, and per-user quota. Group-inherited members must be removed from the owning group instead.

Group Inheritance and Effective Project Role

When a project is owned by a group, the platform resolves each caller's effective project role as follows:

  1. If the user has an explicit project member record, that role wins (admin / manager / user).
  2. Otherwise, for personal projects, the personal owner is treated as admin.
  3. Otherwise, the user's group role in the owning group is used as the effective project role. A Group Admin therefore acts as a Project Admin on every project owned by that group, even without an explicit project_members row.

This is why a Group Admin (e.g. u26000007 in G260000105) can call PUT /api/v1/projects/{projectId}/members/quotas against a group-owned project (e.g. P26000116) and see the batch selection UI in the Members tab without being added as a direct member.


Resources Tab

The Resources tab summarizes live usage by project member:

  • User — Project member or namespace owner
  • GPU — Effective DRA GPU units currently used
  • CPU — Requested or observed CPU cores
  • Memory — Memory usage in MiB
  • Pods — Number of active pods in that member's namespace

Project Admins, Project Managers, and platform Admin/Manager roles see all project members. Regular Project Users see only their own resource row. Expand a row to inspect pods and Kubernetes resources, or terminate resources you are allowed to manage.

For Pod rows, hover, focus, or tap the status badge to inspect recent Kubernetes events. This is useful when a Pod appears to be creating but Kubernetes is actually reporting image pull errors such as ImagePullBackOff or ErrImagePull.


GPU Quota Overview

The Overview tab shows a quota bar for each resource type:

  • GPU — Effective DRA GPU units consumed vs. plan limit
  • CPU — CPU cores consumed by running workspaces
  • Memory — RAM consumed by running workspaces

If the GPU bar is full (100%), new workspace launches will queue and wait for resources to free up.


Common Questions

I don't see a + New Project button.

The button is visible to all authenticated users. If it's missing, try a hard refresh (Ctrl+Shift+R). If the problem persists, contact your admin.

How do I view old config file versions?

On the Config tab, click the clock icon (History) next to any config file to see all past versions with timestamps and SHA-256 hashes.

My project quota is 0 GPUs. How do I get more?

Submit a Resource Request via Requests. An admin will review and bind a Resource Plan to your project.

What's the difference between a workspace's GPU and a GPU Claim?

A workspace launch creates an inline ResourceClaim that lives with the pod. A GPU Claim is a standalone ResourceClaim that survives across deployments — useful when several jobs need the same fractional GPU and you want to avoid rescheduling churn.

Why is the Deploy button disabled?

Either the project has no active plan bound, or the plan window is currently closed and only the default queue is selectable. Check the Plan tab for the next opening time.