1. Overview and Design Goals
Simplify Curvine's data read/write behavior into two mount-point modes:
CacheMode: Reads and writes go through UFS (Underlying File System), with Curvine providing read-only cache acceleration. Data is written directly to UFS without passing through Curvine, so UFS remains fully visible to users. This mode serves as a read cache accelerator and a unified proxy for UFS.
FsMode: Reads and writes go through Curvine, which manages metadata independently and provides both read and write cache acceleration. UFS acts only as Curvine's cold storage layer and is transparent to users. This mode better supports POSIX read/write semantics and accelerates large-scale file processing.
| FsMode Goal | Description |
| --- | --- |
| Unified Entry | All operations are performed through Curvine paths; applications do not access UFS directly. |
| POSIX Semantics | Supports complete POSIX file system semantics (directory tree, random read/write, renaming, atomicity, strong consistency, etc.). |
| Tiered Storage | The Curvine layer stores hot data (metadata + optional local blocks); UFS stores persistent/cold data replicas. |
| Background Flushing | The Master periodically submits Load/Dump tasks to flush Curvine data to UFS based on operations and policies. |
| UFS-Only Replicas | Supports data that exists only in UFS (e.g., S3), with on-demand backfilling or direct reads during access. |
2. FsMode
2.1 Semantics
FsMode is the mount mode for tiered file system semantics, defined as follows:
All I/O goes through Curvine: applications use only Curvine paths and never access UFS directly. If UFS is read or written while bypassing Curvine, data consistency is not guaranteed; users should avoid such operations whenever possible.
Write path: Data is first written to Curvine (metadata + blocks), and the Master-side policy periodically submits Load/Dump tasks in the background to flush the data to UFS (e.g., S3).
Read path: Priority is given to reading from Curvine; if the data has been evicted or only exists in UFS, the data is backfilled via Load (UFS→Curvine) or read directly from UFS.
Replica state: Data may exist only in UFS, in which case the UFS copy is the file's sole replica.
Metadata synchronization: The mount operation synchronizes all metadata of the directory once; no active full metadata synchronization is performed afterward. A `reload-meta` command can be provided to re-synchronize mount-point metadata (it only imports metadata for files that exist solely in UFS into Curvine; other files are left untouched).
Cache lazy loading mode: If a file is read whose metadata is not in Curvine, the read fails with a "file does not exist" error even if the file exists in UFS. Users can manually run `reload-meta` to synchronize the metadata and then retry the read.
Fault scenarios:
Master failure: Users can only access data through the UFS interface.
Worker failure: For multi-replica data, other replicas remain accessible; for single-replica data, direct reading from UFS is required.
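The read-path and lazy-metadata semantics above can be sketched as follows. This is a minimal illustration only; the names (`FileNotFoundInCurvine`, the dict-based stores) are hypothetical and do not reflect Curvine's actual API.

```python
# Hypothetical sketch of FsMode read semantics: serve from Curvine first,
# fall back to the UFS replica, and fail fast when metadata was never
# synchronized (lazy-load mode).

class FileNotFoundInCurvine(Exception):
    """Raised when Curvine has no metadata for the path (lazy-load mode)."""

def read(path, curvine_meta, curvine_blocks, ufs):
    if path not in curvine_meta:
        # Lazy loading: no metadata in Curvine => "file does not exist",
        # even if a UFS replica exists. The user must run reload-meta first.
        raise FileNotFoundInCurvine(path)
    if path in curvine_blocks:
        return curvine_blocks[path]      # hot data: served by Curvine
    # Blocks evicted, only the UFS replica remains: serve this read from
    # UFS; in the real system an async Load task backfills the cache.
    data = ufs[path]
    curvine_blocks[path] = data          # stand-in for the Load task
    return data
```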
2.2 Comparison between CacheMode and FsMode

| Aspect | CacheMode | FsMode |
| --- | --- | --- |
| Writing | Data is written directly to UFS (transparent write); applications are tightly coupled to UFS. | Data is written to Curvine and asynchronously flushed to UFS by the Job Manager; applications interact only with Curvine. |
| Metadata | Accessed directly from UFS. | Maintained by the Curvine Master and periodically synchronized to UFS, with Curvine taking precedence on conflict. Curvine does not actively detect UFS metadata changes made through other interfaces. |
| Reading | Reads from Curvine if the data is cached; otherwise an asynchronous task loads the data into Curvine and the current read goes directly to UFS. | Reads from Curvine first; if the data is absent, the Master marks the file as hot and backfills it to Curvine, with the current read performed directly from UFS. |
| Data Expiration | Deletes both metadata and data blocks in Curvine. | Deletes only data blocks in Curvine; metadata is retained. |
| Consistency | Constrained by UFS (e.g., S3's eventual consistency). | Strong consistency on the Curvine side; eventual consistency with UFS via asynchronous tasks. |
3. Core Processes of FsMode
3.1 Write Process
Metadata
All metadata operations (creation/deletion/renaming, etc.) access Curvine directly and are maintained by the Master.
The Master periodically synchronizes directory and file operations to UFS based on Curvine's metadata journal to keep the UFS namespace consistent with Curvine.
If conflicts are found between Curvine and UFS during synchronization (e.g., duplicate-named files, inconsistent directory structures), Curvine takes precedence and overwrites UFS directly.
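The metadata-synchronization steps above can be sketched as a journal replay onto the UFS namespace, with Curvine winning on conflict. This is an illustrative sketch only; `sync_metadata` and the tuple-based journal format are assumptions, not Curvine's real implementation.

```python
# Hypothetical sketch of the Master's metadata sync: replay Curvine's
# metadata journal onto the UFS namespace. On conflict (e.g., a
# duplicate-named entry already in UFS), Curvine takes precedence.

def sync_metadata(journal, ufs_ns):
    for op, path, *args in journal:
        if op == "create":
            ufs_ns[path] = args[0] if args else {}   # overwrite any conflict
        elif op == "delete":
            ufs_ns.pop(path, None)
        elif op == "rename":
            new_path = args[0]
            # Curvine wins: the rename target replaces any UFS entry there.
            ufs_ns[new_path] = ufs_ns.pop(path, {})
    return ufs_ns
```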
Data
Data written by applications is directly persisted to Curvine (blocks allocated by the Master, stored by Workers).
The Job Manager initiates Load tasks to flush data from Curvine to UFS; flushing can be journal-driven, time-scheduled, or event-driven.
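The write path above can be sketched as two decoupled steps: a synchronous write into Curvine and a background flush to UFS. The function names and dict-based stores below are illustrative assumptions, not Curvine's actual interfaces.

```python
# Hypothetical sketch of the FsMode write path: data lands in Curvine
# first; a background job later copies it to UFS.

def write(path, data, curvine, pending_flush):
    curvine[path] = data         # blocks allocated by the Master, stored by Workers
    pending_flush.append(path)   # the Master/Job Manager flushes later

def flush_to_ufs(curvine, pending_flush, ufs):
    # Runs in the background (journal-, time-, or event-driven).
    while pending_flush:
        path = pending_flush.pop(0)
        ufs[path] = curvine[path]
```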
3.2 Read Process
If data exists in Curvine: Read the data directly from Curvine.
If data does not exist in Curvine (e.g., only a UFS replica exists after TTL eviction):
The Master marks the file as hot data.
Submit a Load task (UFS→Curvine) to the Job Manager.
After the data is flushed to Curvine, read the data from Curvine (or read directly from UFS as selected by the implementation).
3.3 Data Expiration
Only delete data blocks on Curvine, without deleting metadata.
Metadata is retained so that directory listings and file attributes stay visible, and so that subsequent reads can submit Load tasks to backfill the data from UFS based on that metadata.
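The expiration rule above can be sketched in a few lines. The `expire` helper and the dict-based stores are hypothetical, used only to show the invariant: blocks go, metadata stays.

```python
# Hypothetical sketch of FsMode data expiration: evict cached blocks but
# deliberately keep metadata, so listings remain visible and later reads
# can trigger a Load task from UFS.

def expire(path, curvine_meta, curvine_blocks):
    curvine_blocks.pop(path, None)   # delete only the data blocks
    # curvine_meta is intentionally untouched
    return path in curvine_meta      # metadata must survive expiration
```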
4. Recommended Extension Points (Configuration and Code)
The following are design-level recommendations and do not involve specific code implementation details.
4.1 Mount Configuration
Add a new enumeration value `FsMode` to `WriteType`.
Specify `write_type=FsMode` (or an equivalent configuration) during mounting, indicating that the mount point uses tiered semantics: write to Curvine, flush to UFS in the background, and allow UFS-only replicas.
4.2 Master-Side Policies
Flushing policy: Based on the existing journal, TTL (Time-To-Live), or scheduled tasks, generate `LoadJobCommand` (source=Curvine, target=UFS) for paths under FsMode mounts, either periodically or event-driven, reusing the existing `submit_load_job` and Worker Load flows.
UFS-only replicas: Reuse the existing TTL + Export and block eviction logic; FsMode only clarifies this behavior as the "expected" storage state.
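A TTL-driven flushing policy like the one described above can be sketched as a planner that scans FsMode mount points and emits `LoadJobCommand`-like descriptors. The function name, dict shapes, and URI scheme below are illustrative assumptions only.

```python
# Hypothetical sketch of a Master-side flushing policy: for files under
# an FsMode mount whose age exceeds the TTL, plan a LoadJobCommand with
# source=Curvine and target=UFS (to be handed to submit_load_job).

def plan_flush_jobs(files, now, ttl_secs, fs_mode_mounts):
    jobs = []
    for path, mtime in files.items():
        mount = next((m for m in fs_mode_mounts if path.startswith(m)), None)
        if mount is not None and now - mtime >= ttl_secs:
            jobs.append({
                "cmd": "LoadJobCommand",
                "source": "curvine://" + path,   # flush direction: Curvine -> UFS
                "target": "ufs://" + path,
            })
    return jobs
```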
4.3 Client
`UnifiedFileSystem`: When `write_type == FsMode`, data is first written to Curvine, with flushing driven by the Master side. After completion, the client can optionally trigger a single `submit_load` (compatible with existing behavior) or rely entirely on the Master's periodic flushing.
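The client-side dispatch can be sketched as follows. `UnifiedFileSystemSketch` and its dict-backed stores are hypothetical stand-ins, not Curvine's real `UnifiedFileSystem` API.

```python
# Hypothetical sketch of client-side dispatch: FsMode writes go only to
# Curvine (the Master flushes to UFS later); CacheMode writes go
# straight through to UFS.

class UnifiedFileSystemSketch:
    def __init__(self, write_type, curvine, ufs):
        self.write_type = write_type
        self.curvine = curvine
        self.ufs = ufs

    def write(self, path, data):
        if self.write_type == "FsMode":
            # Written to Curvine only; background flushing is Master-driven.
            # The client may optionally submit a single load job here.
            self.curvine[path] = data
        else:
            # CacheMode: transparent write-through to UFS.
            self.ufs[path] = data
```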