
Storage

The platform provides two types of persistent storage, both backed by Kubernetes Persistent Volume Claims (PVCs). Storage persists independently of workspaces — your data is safe when a workspace stops or is deleted.

Access: All users have personal storage. Group storage access is managed by group admins.


Storage Types

|                | Personal Storage                         | Group Storage                          |
|----------------|------------------------------------------|----------------------------------------|
| Owner          | Your user account                        | A group                                |
| Provisioned by | Admin (batch init)                       | Admin (on request)                     |
| Access         | You only                                 | Group members (permissions vary)       |
| Best for       | Personal datasets, notebooks, experiments | Shared datasets, team collaboration   |
| Managed from   | Storage → Personal tab                   | Storage → Group tab                    |

Storage List Page

Figure 1: Storage page with Personal and Group tabs. Each PVC shows capacity, status, and a Browse button.

The Storage page has two tabs:

  • Personal — your private PVCs
  • Group — PVCs belonging to groups you are a member of

Each PVC row shows:

  • Name and description
  • Capacity (e.g., 50Gi)
  • Status: Available, Bound, or Pending
  • Browse button — open the file browser for this PVC
  • Permissions button (group storage, for admins)
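Under the hood, each row corresponds to a Kubernetes PersistentVolumeClaim. A minimal sketch of what such a PVC looks like — the name, namespace, and access mode here are illustrative, not the platform's actual values:

```yaml
# Illustrative only: a PVC like those listed on the Storage page.
# Name and namespace are hypothetical examples.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: alice-personal      # shown as "Name" in the PVC row
  namespace: user-alice
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 50Gi         # shown as "Capacity"
```

The Status column reflects the provisioning state of the underlying claim, which is why a freshly requested PVC may briefly show Pending.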


Storage Lifecycle

```mermaid
flowchart LR
    Request["User Requests Storage\nvia Requests form"] -->|Admin Approves| Provisioned["PVC Created\nin Kubernetes"]
    Provisioned -->|Bound to Project| Available["Available in\nWorkspace Mount"]
    Available -->|Workspace mounts PVC| InUse["Data persists\nacross restarts"]
    InUse -->|Admin deletes PVC| Released["PVC Released\n(data lost)"]
```

PVC deletion is permanent

When an admin deletes a PVC, all data inside it is permanently destroyed. Always back up important data before requesting PVC removal.


Browsing Storage Files

Click Browse next to any PVC to open the file browser.

Figure 2: Storage file browser showing directory listing with upload, download, and delete capabilities.

The file browser supports:

  • Navigating directories
  • Uploading files (drag and drop or file picker)
  • Downloading files and folders
  • Creating new directories
  • Deleting files (use caution — deletions are immediate)


Using Storage in Workspaces

When launching a new workspace, you can mount one or more PVCs:

  1. In the New Workspace form, find the Storage section.
  2. Click + Add Storage.
  3. Select the PVC from the dropdown.
  4. Set the Mount Path (e.g., /data or /workspace/datasets).
  5. Repeat for additional PVCs.

Mount path convention

Use consistent mount paths across workspaces so your code always finds data at the same location regardless of which workspace is running.
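In Kubernetes terms, each + Add Storage entry becomes a volume plus a volumeMount on the workspace pod. A hedged sketch of the resulting pod fragment — the platform generates the real manifest, and the claim and container names below are hypothetical:

```yaml
# Illustrative fragment of a workspace pod spec.
# PVC and container names are assumptions for the example.
spec:
  containers:
    - name: workspace
      volumeMounts:
        - name: datasets
          mountPath: /data        # the "Mount Path" from the form
  volumes:
    - name: datasets
      persistentVolumeClaim:
        claimName: team-datasets  # the PVC selected in the dropdown
```

Adding a second PVC in the form simply adds another volume/volumeMount pair, which is why each mount needs a distinct path.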


Storage Profiles and Lanes

When admins or group admins provision a new PVC, they pick a Storage Profile that maps to a backing storage class and a lane. Each lane is tuned for a different workload shape.

| Profile             | Backend                     | Lane       | Best for                                               |
|---------------------|-----------------------------|------------|--------------------------------------------------------|
| JuiceFS RWX         | JuiceFS over object storage | shared-rwx | Cross-node datasets and checkpoints visible to many pods |
| Legacy RWX          | NFS / legacy share          | legacy-rwx | Long-standing shared data already on the legacy mount  |
| Longhorn (Fast RWO) | Longhorn replicated block   | fast-rwo   | Single-pod hot scratch — fastest reads and writes      |
| Default             | Cluster default             | varies     | Group admin lets the platform pick                     |

The profile picker shows availability and a recommendation per option; unavailable profiles are greyed out with the reason.
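Each profile ultimately determines the storage class and access mode on the PVC the platform creates. As a sketch, assuming a class name matching the shared-rwx lane (the actual class names may differ per cluster), a JuiceFS RWX request might translate to:

```yaml
# Illustrative: storage class and access mode implied by the
# "JuiceFS RWX" profile. Class and claim names are hypothetical.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: team-shared
spec:
  storageClassName: juicefs-rwx  # assumed class for the shared-rwx lane
  accessModes:
    - ReadWriteMany              # RWX: mountable by pods on many nodes
  resources:
    requests:
      storage: 200Gi
```

The key difference between lanes is the access mode: RWX volumes can be mounted by pods on many nodes at once, while fast-rwo volumes are single-pod (ReadWriteOnce) in exchange for speed.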


Training Storage Pattern

For model training on gpu1, gpu2, and gpu3, admins can provision group storage on juicefs-gpu23-rwx. This is the preferred shared path for datasets, checkpoints, and final artifacts that need to be visible across nodes.

For checkpoint-heavy jobs, use this layout:

| Path         | Storage            | Purpose                               |
|--------------|--------------------|---------------------------------------|
| /datasets    | JuiceFS group RWX  | Shared read-mostly training data      |
| /checkpoints | JuiceFS group RWX  | Durable checkpoints and resumable state |
| /scratch     | Fast job-local RWO | Temporary hot files during one job    |

Copy only the final results from scratch back to the shared JuiceFS mount. This keeps repeated reads fast through node-local cache while avoiding unnecessary shared-write pressure.
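The layout above maps to three mounts on the training pod: two paths from the shared JuiceFS group PVC and one from a job-local fast RWO PVC. A hedged sketch, with hypothetical claim names:

```yaml
# Illustrative training-pod fragment for the /datasets, /checkpoints,
# /scratch layout. Claim names are assumptions, not real platform values.
spec:
  containers:
    - name: trainer
      volumeMounts:
        - name: shared
          mountPath: /datasets
          subPath: datasets      # read-mostly training data
        - name: shared
          mountPath: /checkpoints
          subPath: checkpoints   # durable, resumable state
        - name: scratch
          mountPath: /scratch    # hot, job-local temporary files
  volumes:
    - name: shared
      persistentVolumeClaim:
        claimName: group-train-rwx  # JuiceFS group RWX
    - name: scratch
      persistentVolumeClaim:
        claimName: job-scratch-rwo  # fast-rwo lane
```

Mounting the shared PVC twice with subPath keeps datasets and checkpoints in one group volume while giving the code stable, separate paths.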


Fast-Stage Cache

Fast-stage transfers stage data from a group RWX volume into a project-local Fast RWO PVC, so a single training job can read at node-local speed without touching the shared filesystem.

  1. On the project's Storage tab, click Data Path next to a bound RWX volume.
  2. Select Fast-stage Cache in the modal.
  3. Fill in the form:

     | Field              | Notes                                                                 |
     |--------------------|-----------------------------------------------------------------------|
     | Target Namespace   | Where the fast PVC will be created (your project namespace by default). |
     | Fast PVC Name      | DNS-safe name; defaults to fast-<source-pvc>.                         |
     | Fast Capacity (Gi) | Must be at least the source PVC capacity.                             |
     | Checksum Mode      | Size + mtime is the default; choose SHA-256 for strict verification or None to skip checks. |

  4. Click Start Fast Stage. The platform launches a one-off Job that copies the data and reports status.
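Conceptually, the one-off Job mounts both PVCs and copies the source into the fast target. A minimal sketch in that spirit — the image, command, and claim names are assumptions, not the platform's actual Job:

```yaml
# Illustrative only: a copy Job in the spirit of fast-stage.
# Image, command, and claim names are hypothetical.
apiVersion: batch/v1
kind: Job
metadata:
  name: fast-stage-team-datasets
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: copy
          image: alpine:3.20
          command: ["sh", "-c", "cp -a /src/. /dst/"]  # copy everything across
          volumeMounts:
            - { name: src, mountPath: /src, readOnly: true }
            - { name: dst, mountPath: /dst }
      volumes:
        - name: src
          persistentVolumeClaim:
            claimName: team-datasets       # shared RWX source
        - name: dst
          persistentVolumeClaim:
            claimName: fast-team-datasets  # Fast RWO target
```

Because the source is mounted read-only, a staging run can never modify the shared volume; only the fast target is written.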

Sync results back

Fast-stage is intended as a hot read cache. When training finishes, copy final artifacts back to the shared RWX mount — the fast PVC is project-local and not visible to peers.


Storage Permissions

Group storage PVCs have per-user permission levels:

| Permission | Description                                  |
|------------|----------------------------------------------|
| Read       | Can mount and read files; cannot write       |
| Write      | Can mount, read, and write files             |
| Admin      | Full control including permission management |

To view or request permission changes, contact your group admin. Group admins can update permissions in bulk from the group's Storage tab (see Groups).


Request Additional Storage

If you need more storage capacity or a new group storage PVC:

  1. Go to Requests → + New Request.
  2. Select type Resource Request.
  3. Describe the storage you need (size, purpose, group name).
  4. Submit — an admin will review and provision the PVC.

Common Questions

My PVC shows as Pending. When will it be available?

Pending means the Kubernetes volume is being provisioned. This usually takes under 1 minute. If it stays Pending for more than 5 minutes, contact your admin.

I can see a group PVC but can't browse it.

You need at least Read permission on the PVC. Contact your group admin to check your permissions.

Files I uploaded via the browser aren't appearing in my workspace.

Check that your workspace has the PVC mounted at the correct path. PVCs are attached only at workspace startup, so a PVC added after launch won't appear until you restart. Files uploaded to an already-mounted PVC, however, appear immediately; no restart is needed.