[llvm][cas] Fan out persistent storage to reduce filesystem load #11101

ojhunt · 2025-08-02T00:24:27Z

The CAS OnDiskGraphDB backend currently uses a single directory for persistent storage. Over time this can lead to a very large number of files accumulating in one directory. While the point at which this overhead becomes significant varies between file systems, all of them eventually reach a point where the number of entries in a single directory seriously degrades the performance of a wide array of operations on the entries within that directory or on the directory itself.

In this PR we introduce an intermediate layer of subdirectories in order to spread the individual persistent files over a large number of different directories. This is the same general approach that has been taken by a wide array of other applications for exactly the same reasons.

The PR currently uses 2 radix-36 characters, giving 36 * 36 = 1296 subdirectories, or an approximately 10-bit fan out. There does not seem to be a consistent bias in favour of wider or deeper fan outs, but for now a relatively wide and shallow approach seems reasonable.
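To make the fan-out arithmetic concrete, here is a minimal sketch of mapping a hash value to a fixed-width radix-36 directory name. The helper name `fanOutDirName` and its signature are hypothetical and are not taken from the patch; only the digit alphabet and the width of 2 come from the diff.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper (not from the patch): map a hash value to a
// fixed-width radix-36 name. Width 2 yields 36 * 36 = 1296 distinct names.
std::string fanOutDirName(std::uint64_t Hash, unsigned Width = 2) {
  static const char Radix36Chars[] = "0123456789abcdefghijklmnopqrstuvwxyz";
  assert(Width > 0 && "a fan-out directory needs at least one character");
  std::string Name(Width, '0');
  // Fill from the right so the least significant digit comes last.
  // Digits beyond Width are dropped, so the value wraps modulo 36^Width.
  for (unsigned I = Width; I-- > 0;) {
    Name[I] = Radix36Chars[Hash % 36];
    Hash /= 36;
  }
  return Name;
}
```

With a width of 2, a hash of 36 maps to "10" and a hash of 1295 maps to "zz". Zero padding keeps every directory name the same length.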

We currently choose not to use a fan out for temporary files, mostly for practical reasons. When fanning out across subdirectories we need to ensure that the subdirectories exist, but LLVM's current APIs for constructing temporary files do not provide a mechanism for automatically building the required directories. While we could add such functionality, it may actually hinder performance rather than help: temporary files created by LLVM are just that, and so are cleaned up at the end of execution. As a result the risk of large numbers of files accumulating is relatively low. At the same time, the additional file system work needed to check for and then create new directory entries is relatively high.

ojhunt · 2025-08-02T00:25:16Z

I'm not super happy with the optional TempDir out parameter I use but everything else seems worse.

It also seems like the CAS version should change but I don't know what the rules around that are.

ojhunt · 2025-08-02T00:27:27Z

llvm/lib/CAS/OnDiskGraphDB.cpp

+
+  // This is around 10 bits of entropy given we're using radix-36
+  static const unsigned IntermediateDirNameLength = 2;
+  static const char Radix36Chars[] = "0123456789abcdefghijklmnopqrstuvwxyz";


@benlangmuir This is a relatively common encoding for people to use in this kind of context, but I couldn't find an existing implementation anywhere in the tree, and I was not sure it was so incredibly common that it warranted adding one.

This is how I've usually seen this done in llvm:

toString(llvm::APInt(64, Hash), 36, /*Signed=*/false);

Oops sent before finishing this comment:

There is also std::to_chars, which doesn't need to allocate a string.
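For reference, a small sketch of the `std::to_chars` route using only the standard library. The wrapper name `toRadix36` and the caller-supplied buffer are assumptions made for illustration; the patch does not contain this helper.

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <cstdint>
#include <string_view>
#include <system_error>

// Hypothetical wrapper (not from the patch): write Value in base 36 into a
// caller-provided buffer and return a view of the digits. No heap allocation.
// 13 characters are enough for any 64-bit value, since 36^13 > 2^64.
std::string_view toRadix36(std::uint64_t Value, char (&Buffer)[13]) {
  std::to_chars_result Result =
      std::to_chars(Buffer, Buffer + sizeof(Buffer), Value, 36);
  assert(Result.ec == std::errc() && "13 chars always fit a 64-bit value");
  return std::string_view(Buffer,
                          static_cast<std::size_t>(Result.ptr - Buffer));
}
```

Note that `std::to_chars` emits lowercase digits without leading zeros, so a fixed-width directory name would still need explicit padding.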

ojhunt · 2025-08-02T00:33:05Z

llvm/lib/CAS/OnDiskGraphDB.cpp

+  PersistentPath.assign(RootPath.begin(), RootPath.end());
+  if (TempPath)
+    TempPath->assign(RootPath.begin(), RootPath.end());
+  SmallVector<char, 256> FileNameBuffer;


This buffer is a bit irksome, but the lack of a Twine iterator requires the copy in order to hash the filename.

We could just hash the Offset which would save the copy but given we have prefix and suffix available it seemed reasonable to use them.

I may look into the stable hash interfaces over the weekend to see if this can be performed in a better way.

benlangmuir

This requires a CAS format version bump, which you can accomplish by updating FilePrefix at the top of OnDiskGraphDB.cpp. It will probably require some test changes to match.

benlangmuir · 2025-08-04T16:28:07Z

llvm/lib/CAS/OnDiskGraphDB.cpp

+
+  // This is around 10 bits of entropy given we're using radix-36
+  static const unsigned IntermediateDirNameLength = 2;
+  static const char Radix36Chars[] = "0123456789abcdefghijklmnopqrstuvwxyz";


This is how I've usually seen this done in llvm:

toString(llvm::APInt(64, Hash), 36, /*Signed=*/false);

benlangmuir · 2025-08-04T16:31:14Z

llvm/lib/CAS/OnDiskGraphDB.cpp

+  StringRef FileName =
+    (FilePrefix + Twine(I.Offset.get()) + Suffix).toStringRef(FileNameBuffer);
+
+  unsigned FileNameHash = stable_hash_name(FileName);


Unfortunately, stable_hash_... is not stable enough for our purposes. It is documented to be stable across environments and program executions, but critically it is allowed to change in future versions, which would require a CAS format version bump.

benlangmuir · 2025-08-04T16:41:43Z

llvm/lib/CAS/OnDiskGraphDB.cpp

@@ -596,7 +597,17 @@ Error OnDiskGraphDB::TempFile::keep(const Twine &Name) {
  assert(!Done);
  Done = true;
  // Always try to close and rename.
-  std::error_code RenameEC = sys::fs::rename(TmpName, Name);
+  std::error_code RenameEC = sys::fs::create_directories(
+    sys::path::parent_path(Name.str())


I wonder if we should attempt the rename first and only create the directory on failure. That will increase the cost the first time we put a file in that subdirectory (+1 rename call), but decrease it for any subsequent files (-1 mkdir). A couple of small-ish CAS databases I have sitting around have 3000+ standalone files, which would be more than 2 per sub-directory on average. It's not a big deal either way since we can change this without a format change in the future.
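A rough sketch of the rename-first ordering, written against `std::filesystem` rather than `llvm::sys::fs` so that it stands alone. The helper name `renameCreatingParent` is hypothetical, and the sketch assumes that a missing parent directory is the most likely reason for the first rename to fail.

```cpp
#include <filesystem>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper (not from the patch): rename optimistically and only
// pay for directory creation when the first attempt fails.
std::error_code renameCreatingParent(const fs::path &From, const fs::path &To) {
  std::error_code EC;
  fs::rename(From, To, EC);
  if (!EC)
    return EC; // Common case: the subdirectory already exists.

  // Assume the failure was caused by a missing parent directory. If it was
  // something else, the retry below fails again and reports that error.
  fs::path Parent = To.parent_path();
  if (Parent.empty())
    return EC;
  std::error_code DirEC;
  fs::create_directories(Parent, DirEC);
  if (DirEC)
    return DirEC;

  fs::rename(From, To, EC);
  return EC;
}
```

The first file placed in a subdirectory costs one extra rename call, while every later file in that subdirectory skips the directory creation entirely.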

benlangmuir · 2025-08-04T16:53:51Z

llvm/lib/CAS/OnDiskGraphDB.cpp

+  PersistentPath.assign(RootPath.begin(), RootPath.end());
+  if (TempPath)
+    TempPath->assign(RootPath.begin(), RootPath.end());
+  SmallVector<char, 256> FileNameBuffer;


We should just hash the offset. The prefix is the same for every file in the CAS and each offset only corresponds to a single file, so suffix is redundant with the offset for hashing purposes.
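One possible shape for hashing only the offset, sketched with a fully spelled-out mixer so that the result cannot change when a library hash changes (the concern raised about stable_hash_... in this review). The choice of the SplitMix64 finalizer and the names `mixOffset` and `fanOutBucket` are assumptions for illustration; the review does not settle on a specific hash function.

```cpp
#include <cstdint>

// Assumption: a fixed 64-bit mixer (the SplitMix64 finalizer), written out
// in full so its output is pinned by this code rather than by a library.
constexpr std::uint64_t mixOffset(std::uint64_t X) {
  X = (X ^ (X >> 30)) * 0xbf58476d1ce4e5b9ULL;
  X = (X ^ (X >> 27)) * 0x94d049bb133111ebULL;
  return X ^ (X >> 31);
}

// Hypothetical helper: pick one of the 36 * 36 = 1296 subdirectories from
// the file's offset alone, with no prefix or suffix and no string copy.
constexpr unsigned fanOutBucket(std::uint64_t Offset) {
  return static_cast<unsigned>(mixOffset(Offset) % (36 * 36));
}
```

Because the input is a plain integer, no filename buffer is needed before hashing. Any change to the mixer would still alter the on-disk layout and therefore require a format version bump.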

ojhunt requested a review from benlangmuir August 2, 2025 00:24

ojhunt self-assigned this Aug 2, 2025

benlangmuir requested review from akyrtzi and cachemeifyoucan August 4, 2025 16:57
