Syncing Data with Object Storage

You can sync data both ways between a Lustre file system and an Object Storage bucket. Pull data from Object Storage into Lustre (import data) when you need high-speed access for AI training or data processing. When you're done, push the results back to Object Storage (export data) for cost-effective, long-term storage.

To set up this two-way sync, link a Lustre directory to an Object Storage bucket. Use this link to import objects from Object Storage into Lustre and export files from Lustre back to Object Storage as needed.

Beginning an import or export creates a job. Each job has a unique ID, which you can use to see details.

When you start an import or export job, all new and changed files are copied along with their metadata.

Prerequisites

Before you set up Lustre object sync with Object Storage, ensure:
  • You have at least one Object Storage bucket in the same region and tenancy as the Lustre file system. Cross-region or cross-tenancy import and export isn't supported.
  • The Lustre file system has enough free space to hold data imports from Object Storage.
  • All the required IAM permissions are configured.
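
The free-space prerequisite can be checked from a client before starting an import. The following is a minimal sketch, not part of the service: it assumes the Lustre file system is mounted on the machine running the script, that you already know the total size of the objects you plan to import, and that a 10% headroom is reasonable. The function name and the headroom value are hypothetical.

```python
import shutil


def has_room_for_import(mount_point: str, import_bytes: int, headroom: float = 0.10) -> bool:
    """Return True if the file system at mount_point has enough free space.

    mount_point  -- path where the Lustre file system is mounted (assumption)
    import_bytes -- total size of the objects to import, in bytes
    headroom     -- extra fraction of space to keep free (hypothetical default)
    """
    # shutil.disk_usage reports total, used, and free bytes for the file system
    # that contains the given path.
    free_bytes = shutil.disk_usage(mount_point).free
    return free_bytes >= import_bytes * (1 + headroom)
```

For example, has_room_for_import("/mnt/lustre", 5 * 1024**4) reports whether a 5 TiB import plus headroom fits. If it doesn't, expand the file system first and keep the expansion cool-down period in mind.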

Considerations

Keep these points in mind when you sync files between Lustre and Object Storage:
  • Jobs Copy Only New and Changed Files: On the first export for a link, all files in the Lustre file system are copied to the Object Storage bucket because everything is new. Items that existed in the bucket before the export are unchanged. For later exports, only files that are new or have been updated since the last job are copied. Deletions aren't mirrored in either direction.
  • Hard Links and Metadata-Only Changes Not Copied: Import or export jobs don't copy files if only their metadata (such as a UID or modification time) has changed since the last job. Files that share content through hard links are each treated as a separate file during import and export, so the hard link between them is lost.
  • Single Region and Tenancy: You can only import and export files between Object Storage buckets and Lustre file systems that are in the same region and tenancy. You can't import or export across regions or tenancies.
  • Single Job Limitation: You can run only one import or export job at a time per file system. If multiple links belong to the same file system, their jobs can't run concurrently. However, jobs on different file systems can run at the same time.
  • 10 Link Limit: You can create a maximum of 10 Object Storage links for each Lustre file system. If you need more links, contact support.
  • Editing Links: You can edit links to update these properties: name, Object Storage compartment, tags, and whether to overwrite or skip conflicted files. To make any other changes, delete the link and create a new one.
  • Expansion Cool Down Period: Sometimes, you might need to expand the Lustre file system to accommodate data from Object Storage. A six-hour cool-down period applies between consecutive expansions of the same Lustre file system. An expansion request made within this cool-down period is rejected.
  • Performance Impact: File syncing between Object Storage and Lustre consumes bandwidth and might slightly affect Lustre's performance during the sync process.
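
The copy rules in the first two points can be illustrated with a short model. This is only a sketch of the behavior described above, not the service's implementation: how the service detects a content change isn't stated here, so the content_digest field is a hypothetical stand-in for "the file's content", and FileState and select_for_copy are illustrative names.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class FileState:
    content_digest: str  # hypothetical stand-in for the file's content
    uid: int             # metadata
    mtime: float         # metadata


def select_for_copy(current: dict[str, FileState],
                    last_job: dict[str, FileState]) -> list[str]:
    """Return the paths a job would copy under the rules described above."""
    selected = []
    for path, state in current.items():
        previous = last_job.get(path)
        # New files and files whose content changed are copied.
        # A change to uid or mtime alone doesn't select the file.
        if previous is None or previous.content_digest != state.content_digest:
            selected.append(path)
    # Paths present in last_job but missing from current are ignored:
    # deletions aren't mirrored.
    return sorted(selected)
```

With an empty last_job, every path is selected, which matches the first export for a link. On later jobs, a file whose UID or modification time changed but whose content didn't is skipped, and a file deleted on the source side produces no action on the destination.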

Best Practices

Here are some best practices to follow when syncing files between Lustre and Object Storage:
  • Keep Paths Unique: When you create Object Storage links, don't use overlapping paths, where parts of the Lustre directory or Object Storage bucket path are already used by another link for the same file system. Overlapping links might lead to deep copies and an undesirable directory structure.

    For example, link /mnt/lustre/projectA to mybucket/projectA and link /mnt/lustre/projectB to mybucket/projectB. This is correct because each link uses a unique Lustre path and a unique Object Storage bucket prefix.

    Don't link /mnt/lustre/project to mybucket/projects and /mnt/lustre/project/reports to mybucket/project/reports because the reports folder gets mapped twice, causing duplicate syncs, unexpected nesting, and conflicts.

  • Make File Changes Only When Import or Export Jobs Aren't Running: Only update the content of a file when you're certain it isn't being imported or exported. Changing files while a sync job is in progress can lead to unexpected results, such as files being ignored or overwritten.
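
The path-uniqueness guideline above can be checked before you create links. The following is a minimal sketch under stated assumptions: each link is represented as a (Lustre path, bucket path) pair, and two paths overlap when they are equal or one is an ancestor of the other. The function names are hypothetical and aren't part of any service API.

```python
from itertools import combinations
from pathlib import PurePosixPath


def paths_overlap(a: str, b: str) -> bool:
    """True if the paths are equal or one is an ancestor of the other."""
    pa, pb = PurePosixPath(a), PurePosixPath(b)
    # Comparing whole path components avoids false positives such as
    # "project" being treated as a prefix of "projectA".
    return pa == pb or pa in pb.parents or pb in pa.parents


def find_overlapping_links(links: list) -> list:
    """Return every pair of links whose Lustre paths or bucket paths overlap.

    links -- list of (lustre_path, bucket_path) tuples for one file system
    """
    conflicts = []
    for first, second in combinations(links, 2):
        if paths_overlap(first[0], second[0]) or paths_overlap(first[1], second[1]):
            conflicts.append((first, second))
    return conflicts
```

Run against the two examples above, the projectA and projectB pair produces no conflicts, while the pair that links both /mnt/lustre/project and /mnt/lustre/project/reports is reported because one Lustre path is nested inside the other.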

Data Encryption

Here is how data is encrypted in this bidirectional transfer between Lustre and Object Storage:
  • Data in Transit: All data transferred between Lustre and Object Storage is encrypted during transit.
  • Data at Rest: Imported data is encrypted at rest using block volume encryption, and data exported to Object Storage uses the Object Storage encryption mechanisms.