Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Modules] Delay deserialization of preferred_name attribute at r… #122726

Merged
merged 7 commits into from
Jan 17, 2025

Conversation

VitaNuo
Copy link
Contributor

@VitaNuo VitaNuo commented Jan 13, 2025

…ecord level.

This fixes the incorrect diagnostic emitted when compiling the following snippet

// string_view.h
template<class _CharT>
class basic_string_view;

typedef basic_string_view<char> string_view;

template<class _CharT>
class
__attribute__((__preferred_name__(string_view)))
basic_string_view {
public:
    basic_string_view() 
    {
    }
};

inline basic_string_view<char> foo()
{
  return basic_string_view<char>();
}
// A.cppm
module;
#include "string_view.h"
export module A;

// Use.cppm
module;
#include "string_view.h"
export module Use;
import A;

The diagnostic is

string_view.h:11:5: error: 'basic_string_view<char>::basic_string_view' from module 'A.<global>' is not present in definition of 'string_view' provided earlier

The underlying issue is that deserialization of the preferred_name attribute triggers deserialization of basic_string_view<char>, which triggers the deserialization of the preferred_name attribute again (since it's attached to the basic_string_view template).
The deserialization logic is implemented in a way that prevents it from going on a loop in a literal sense (it detects early on that it has already seen the string_view typedef when trying to start its deserialization for the second time), but leaves the typedef deserialization in an unfinished state. Subsequently, the string_view typedef from the deserialized module cannot be merged with the same typedef from string_view.h, resulting in the above diagnostic.

This PR resolves the problem by delaying the deserialization of the preferred_name attribute until the deserialization of the basic_string_view template is completed. As a result of deferring, the deserialization of the preferred_name attribute doesn't need to go on a loop since the type of the string_view typedef is already known when it's deserialized.

@VitaNuo VitaNuo self-assigned this Jan 13, 2025
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:modules C++20 modules and Clang Header Modules labels Jan 13, 2025
@llvmbot
Copy link
Member

llvmbot commented Jan 13, 2025

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-modules

Author: Viktoriia Bakalova (VitaNuo)

Changes

…ecord level.


Full diff: https://github.com/llvm/llvm-project/pull/122726.diff

6 Files Affected:

  • (modified) clang/include/clang/Serialization/ASTReader.h (+8)
  • (modified) clang/include/clang/Serialization/ASTRecordReader.h (+10)
  • (modified) clang/lib/Serialization/ASTReader.cpp (+5)
  • (modified) clang/lib/Serialization/ASTReaderDecl.cpp (+84-5)
  • (modified) clang/lib/Serialization/ASTWriter.cpp (+12-2)
  • (modified) clang/test/Modules/preferred_name.cppm (+9-3)
diff --git a/clang/include/clang/Serialization/ASTReader.h b/clang/include/clang/Serialization/ASTReader.h
index 9f978762a6fb6b..38f4cf6990296b 100644
--- a/clang/include/clang/Serialization/ASTReader.h
+++ b/clang/include/clang/Serialization/ASTReader.h
@@ -1205,6 +1205,12 @@ class ASTReader
   /// been completed.
   std::deque<PendingDeclContextInfo> PendingDeclContextInfos;
 
+  struct PendingPreferredNameAttribute {
+    uint64_t RecordIdx;
+    Decl *D;
+  };
+  SmallVector<PendingPreferredNameAttribute> PendingPreferredNameAttributes;
+
   template <typename DeclTy>
   using DuplicateObjCDecls = std::pair<DeclTy *, DeclTy *>;
 
@@ -1551,6 +1557,8 @@ class ASTReader
   void loadPendingDeclChain(Decl *D, uint64_t LocalOffset);
   void loadObjCCategories(GlobalDeclID ID, ObjCInterfaceDecl *D,
                           unsigned PreviousGeneration = 0);
+  void loadPreferredNameAttribute(
+      const PendingPreferredNameAttribute &PreferredNameAttribute);
 
   RecordLocation getLocalBitOffset(uint64_t GlobalOffset);
   uint64_t getGlobalBitOffset(ModuleFile &M, uint64_t LocalOffset);
diff --git a/clang/include/clang/Serialization/ASTRecordReader.h b/clang/include/clang/Serialization/ASTRecordReader.h
index 2561418b78ca7f..252015c472bc00 100644
--- a/clang/include/clang/Serialization/ASTRecordReader.h
+++ b/clang/include/clang/Serialization/ASTRecordReader.h
@@ -337,6 +337,12 @@ class ASTRecordReader
   /// Reads attributes from the current stream position, advancing Idx.
   void readAttributes(AttrVec &Attrs);
 
+  /// Reads one attribute from the current stream position, advancing Idx.
+  Attr *readAttr(Decl *D);
+
+  /// Reads attributes from the current stream position, advancing Idx.
+  void readAttributes(AttrVec &Attrs, Decl *D);
+
   /// Read an BTFTypeTagAttr object.
   BTFTypeTagAttr *readBTFTypeTagAttr() {
     return cast<BTFTypeTagAttr>(readAttr());
@@ -355,6 +361,10 @@ class ASTRecordReader
   SwitchCase *getSwitchCaseWithID(unsigned ID) {
     return Reader->getSwitchCaseWithID(ID);
   }
+
+private:
+  Attr *readAttrImpl(Decl *D);
+  void readAttributesImpl(AttrVec &Attrs, Decl *D);
 };
 
 /// Helper class that saves the current stream position and
diff --git a/clang/lib/Serialization/ASTReader.cpp b/clang/lib/Serialization/ASTReader.cpp
index b53f99732cacce..971a2538e9ed74 100644
--- a/clang/lib/Serialization/ASTReader.cpp
+++ b/clang/lib/Serialization/ASTReader.cpp
@@ -10079,6 +10079,11 @@ void ASTReader::finishPendingActions() {
     }
     PendingDeducedVarTypes.clear();
 
+    // Load the delayed preferred name attributes.
+    for (unsigned I = 0; I != PendingPreferredNameAttributes.size(); ++I)
+      loadPreferredNameAttribute(PendingPreferredNameAttributes[I]);
+    PendingPreferredNameAttributes.clear();
+
     // For each decl chain that we wanted to complete while deserializing, mark
     // it as "still needs to be completed".
     for (unsigned I = 0; I != PendingIncompleteDeclChains.size(); ++I) {
diff --git a/clang/lib/Serialization/ASTReaderDecl.cpp b/clang/lib/Serialization/ASTReaderDecl.cpp
index dee5169ae5723a..7aacfaab532e5e 100644
--- a/clang/lib/Serialization/ASTReaderDecl.cpp
+++ b/clang/lib/Serialization/ASTReaderDecl.cpp
@@ -612,7 +612,7 @@ void ASTDeclReader::VisitDecl(Decl *D) {
 
   if (HasAttrs) {
     AttrVec Attrs;
-    Record.readAttributes(Attrs);
+    Record.readAttributes(Attrs, D);
     // Avoid calling setAttrs() directly because it uses Decl::getASTContext()
     // internally which is unsafe during derialization.
     D->setAttrsImpl(Attrs, Reader.getContext());
@@ -3118,13 +3118,22 @@ class AttrReader {
     return Reader.readVersionTuple();
   }
 
+  void skipInts(unsigned N) {
+    Reader.skipInts(N);
+  }
+
+  unsigned getCurrentIdx() {
+    return Reader.getIdx();
+  }
+
   OMPTraitInfo *readOMPTraitInfo() { return Reader.readOMPTraitInfo(); }
 
   template <typename T> T *readDeclAs() { return Reader.readDeclAs<T>(); }
 };
 }
 
-Attr *ASTRecordReader::readAttr() {
+/// Reads one attribute from the current stream position, advancing Idx.
+Attr *ASTRecordReader::readAttrImpl(Decl *D) {
   AttrReader Record(*this);
   auto V = Record.readInt();
   if (!V)
@@ -3134,6 +3143,17 @@ Attr *ASTRecordReader::readAttr() {
   // Kind is stored as a 1-based integer because 0 is used to indicate a null
   // Attr pointer.
   auto Kind = static_cast<attr::Kind>(V - 1);
+  if (Kind == attr::PreferredName && D != nullptr) {
+    if (D != nullptr) {
+      Reader->PendingPreferredNameAttributes.push_back(
+          {Record.getCurrentIdx() - 1, D});
+      auto SkipCount = Record.readInt();
+      Record.skipInts(SkipCount);
+      return nullptr;
+    }
+    // Ignore the skip count when resolving pending actions.
+    Record.readInt();
+  }
   ASTContext &Context = getContext();
 
   IdentifierInfo *AttrName = Record.readIdentifier();
@@ -3159,13 +3179,27 @@ Attr *ASTRecordReader::readAttr() {
   return New;
 }
 
-/// Reads attributes from the current stream position.
-void ASTRecordReader::readAttributes(AttrVec &Attrs) {
+void ASTRecordReader::readAttributesImpl(AttrVec &Attrs, Decl *D) {
   for (unsigned I = 0, E = readInt(); I != E; ++I)
-    if (auto *A = readAttr())
+    if (auto *A = readAttr(D))
       Attrs.push_back(A);
 }
 
+Attr *ASTRecordReader::readAttr() { return readAttrImpl(nullptr); }
+
+/// Reads attributes from the current stream position.
+void ASTRecordReader::readAttributes(AttrVec &Attrs) {
+  readAttributesImpl(Attrs, nullptr);
+}
+
+/// Reads one attribute from the current stream position, advancing Idx.
+Attr *ASTRecordReader::readAttr(Decl *D) { return readAttrImpl(D); }
+
+/// Reads attributes from the current stream position, advancing Idx.
+void ASTRecordReader::readAttributes(AttrVec &Attrs, Decl *D) {
+  readAttributesImpl(Attrs, D);
+}
+
 //===----------------------------------------------------------------------===//
 // ASTReader Implementation
 //===----------------------------------------------------------------------===//
@@ -4424,6 +4458,51 @@ void ASTReader::loadPendingDeclChain(Decl *FirstLocal, uint64_t LocalOffset) {
   ASTDeclReader::attachLatestDecl(CanonDecl, MostRecent);
 }
 
+void ASTReader::loadPreferredNameAttribute(
+    const PendingPreferredNameAttribute &PreferredNameAttribute) {
+  Decl *D = PreferredNameAttribute.D;
+  ModuleFile *M = getOwningModuleFile(D);
+
+  unsigned LocalDeclIndex = D->getGlobalID().getLocalDeclIndex();
+  const DeclOffset &DOffs = M->DeclOffsets[LocalDeclIndex];
+  RecordLocation Loc(M, DOffs.getBitOffset(M->DeclsBlockStartOffset));
+
+  llvm::BitstreamCursor &Cursor = Loc.F->DeclsCursor;
+  SavedStreamPosition SavedPosition(Cursor);
+  if (llvm::Error Err = Cursor.JumpToBit(Loc.Offset)) {
+    Error(std::move(Err));
+  }
+
+  Expected<unsigned> MaybeCode = Cursor.ReadCode();
+  if (!MaybeCode) {
+    llvm::report_fatal_error(
+        Twine("ASTReader::loadPreferredNameAttribute failed reading code: ") +
+        toString(MaybeCode.takeError()));
+  }
+  unsigned Code = MaybeCode.get();
+
+  ASTRecordReader Record(*this, *Loc.F);
+  Expected<unsigned> MaybeRecCode = Record.readRecord(Cursor, Code);
+  if (!MaybeRecCode) {
+    llvm::report_fatal_error(
+        Twine(
+            "ASTReader::loadPreferredNameAttribute failed reading rec code: ") +
+        toString(MaybeCode.takeError()));
+  }
+  unsigned RecCode = MaybeRecCode.get();
+  if (RecCode != DECL_CXX_RECORD) {
+    llvm::report_fatal_error(
+        Twine("ASTReader::loadPreferredNameAttribute failed reading rec code: "
+              "expected CXXRecord") +
+        toString(MaybeCode.takeError()));
+  }
+
+  Record.skipInts(PreferredNameAttribute.RecordIdx);
+  Attr *PreferredNameAttr = Record.readAttr(nullptr);
+  AttrVec &Attrs = getContext().getDeclAttrs(D);
+  Attrs.push_back(PreferredNameAttr);
+}
+
 namespace {
 
   /// Given an ObjC interface, goes through the modules and links to the
diff --git a/clang/lib/Serialization/ASTWriter.cpp b/clang/lib/Serialization/ASTWriter.cpp
index 39004fd4d4c376..c96a635592cd8a 100644
--- a/clang/lib/Serialization/ASTWriter.cpp
+++ b/clang/lib/Serialization/ASTWriter.cpp
@@ -37,6 +37,7 @@
 #include "clang/AST/Type.h"
 #include "clang/AST/TypeLoc.h"
 #include "clang/AST/TypeLocVisitor.h"
+#include "clang/Basic/AttrKinds.h"
 #include "clang/Basic/Diagnostic.h"
 #include "clang/Basic/DiagnosticOptions.h"
 #include "clang/Basic/FileEntry.h"
@@ -4909,12 +4910,16 @@ void ASTRecordWriter::AddAttr(const Attr *A) {
   // FIXME: Clang can't handle the serialization/deserialization of
   // preferred_name properly now. See
   // https://github.com/llvm/llvm-project/issues/56490 for example.
-  if (!A || (isa<PreferredNameAttr>(A) &&
-             Writer->isWritingStdCXXNamedModules()))
+  if (!A)
     return Record.push_back(0);
 
   Record.push_back(A->getKind() + 1); // FIXME: stable encoding, target attrs
 
+  auto SkipIdx = Record.size();
+  if (A->getKind() == attr::PreferredName)
+    // Add placeholder for the size of preferred_name attribute.
+    Record.push_back(0);
+
   Record.AddIdentifierRef(A->getAttrName());
   Record.AddIdentifierRef(A->getScopeName());
   Record.AddSourceRange(A->getRange());
@@ -4925,6 +4930,11 @@ void ASTRecordWriter::AddAttr(const Attr *A) {
   Record.push_back(A->isRegularKeywordAttribute());
 
 #include "clang/Serialization/AttrPCHWrite.inc"
+
+  if (A->getKind() == attr::PreferredName)
+    // Record the actual size of preferred_name attribute (-1 to count the
+    // placeholder).
+    Record[SkipIdx] = Record.size() - SkipIdx - 1;
 }
 
 /// Emit the list of attributes to the specified record.
diff --git a/clang/test/Modules/preferred_name.cppm b/clang/test/Modules/preferred_name.cppm
index 806781a81c5ca7..4a73961683bdd3 100644
--- a/clang/test/Modules/preferred_name.cppm
+++ b/clang/test/Modules/preferred_name.cppm
@@ -53,10 +53,16 @@ import A;
 export using ::foo_templ;
 
 //--- Use1.cpp
-import A;         // [email protected]:8 {{attribute declaration must precede definition}}
-#include "foo.h"  // [email protected]:9 {{previous definition is here}}
-
+// expected-no-diagnostics
+import A;
+#include "foo.h"
 //--- Use2.cpp
 // expected-no-diagnostics
 #include "foo.h"
 import A;
+
+//--- Use3.cpp
+#include "foo.h"
+import A;
+foo test;
+int size = test.size(); // expected-error {{no member named 'size' in 'foo'}}
\ No newline at end of file

@VitaNuo
Copy link
Contributor Author

VitaNuo commented Jan 13, 2025

Alternative approach to #122250.

Copy link

github-actions bot commented Jan 13, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

Copy link
Contributor

@ilya-biryukov ilya-biryukov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this is marked as WIP, but I find the direction promising so I went ahead and actually made a round of review.

Please wait for @ChuanqiXu9 in case he'll have more comments, but I hope two of us are not too far off in terms of where we want this to go.

@@ -1205,6 +1205,12 @@ class ASTReader
/// been completed.
std::deque<PendingDeclContextInfo> PendingDeclContextInfos;

struct PendingPreferredNameAttribute {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NIT: there's nothing specific to PreferredName her, could we just call it DeferredAttribute?
(Same for functions that have it as a substring)

@@ -3134,6 +3139,17 @@ Attr *ASTRecordReader::readAttr() {
// Kind is stored as a 1-based integer because 0 is used to indicate a null
// Attr pointer.
auto Kind = static_cast<attr::Kind>(V - 1);
if (Kind == attr::PreferredName && D != nullptr) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a comment explaining why we need to defer some attribute?

@@ -3134,6 +3139,17 @@ Attr *ASTRecordReader::readAttr() {
// Kind is stored as a 1-based integer because 0 is used to indicate a null
// Attr pointer.
auto Kind = static_cast<attr::Kind>(V - 1);
if (Kind == attr::PreferredName && D != nullptr) {
if (D != nullptr) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

D is never null here, this if seems to be redundant.

Also, should we assert that D != nullptr when we deserialize PreferredName?
If we start deferring more attributes at some point, an assertion like this might not hold up, but it gives a good sanity check for preferred name specifically, because it should always be attached to some declaration.

Copy link
Contributor Author

@VitaNuo VitaNuo Jan 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

D is never null here, this if seems to be redundant.

This is actually not true. This is a trick that allows to send deserialization on two different paths when (1) deserializing the decl in the main deserialization round vs. (2) deserializing the pending attribute.

In (1) Decl is not null, and the attribute is deferred (and the Decl stored in the pending attributes structure), whereas in (2) the deserialization of the attribute has been initiated by processing the pending attributes, and we don't want to send it on a circle by deferring it again, so we set the Decl to nullptr in this call readOrDeferAttrImpl.

The null check in line 3142 is wrong though. It's an artefact of my experiments, sorry.

}

/// Reads one attribute from the current stream position, advancing Idx.
Attr *ASTRecordReader::readAttr(Decl *D) { return readAttrImpl(D); }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The behavior of this funciton is drastically different from readAttr, could we use a different name and document that it delays reading of some attributes?

Same for readAttributes

@@ -1205,6 +1205,12 @@ class ASTReader
/// been completed.
std::deque<PendingDeclContextInfo> PendingDeclContextInfos;

struct PendingPreferredNameAttribute {
uint64_t RecordIdx;
Decl *D;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we use a more descriptive name and/or add a comment explaining why we have this Decl here?
D looks fine for parameters, but a little too short for a struct.

Decl *D = PreferredNameAttribute.D;
ModuleFile *M = getOwningModuleFile(D);

unsigned LocalDeclIndex = D->getGlobalID().getLocalDeclIndex();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought we might need to store the indices separately, but having the declaration around we actually have enough to look up the bitstream position.

Nice trick, kudos for coming up with this!

toString(MaybeCode.takeError()));
}
unsigned RecCode = MaybeRecCode.get();
if (RecCode != DECL_CXX_RECORD) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason to limit this to this particular Decl?
I think having a generic mechanism that works for any decls is fine here, just in case we need to defer more attributes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True. Though I think I still need to check that RecCode is a valid DeclCode.

Record.skipInts(PreferredNameAttribute.RecordIdx);
Attr *PreferredNameAttr = Record.readAttr(nullptr);
AttrVec &Attrs = getContext().getDeclAttrs(D);
Attrs.push_back(PreferredNameAttr);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NIT: maybe merge the two lines?

getContext().getDeclAttrs(D).push_back(PreferredNameAttr)

@@ -4925,6 +4930,11 @@ void ASTRecordWriter::AddAttr(const Attr *A) {
Record.push_back(A->isRegularKeywordAttribute());

#include "clang/Serialization/AttrPCHWrite.inc"

if (A->getKind() == attr::PreferredName)
// Record the actual size of preferred_name attribute (-1 to count the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LLVM Style Guide covers this situation and suggests to have braces when there are comments in the single branch, see https://llvm.org/docs/CodingStandards.html#don-t-use-braces-on-simple-single-statement-bodies-of-if-else-loop-statements.

Could you add braces here?
The if above is a bit more borderline, but I'd also suggest adding braces there.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, thank you.

@VitaNuo VitaNuo force-pushed the fix_preferred_name_alternative_approach branch from bd13df1 to b611109 Compare January 14, 2025 13:01
@AaronBallman AaronBallman requested a review from Endilll January 14, 2025 13:18
@AaronBallman
Copy link
Collaborator

Can you add more details to the patch summary explaining why the changes are needed? That makes it easier for folks to review the patch but also helps us in the future when digging through historical changes. Thanks!

@VitaNuo
Copy link
Contributor Author

VitaNuo commented Jan 14, 2025

Can you add more details to the patch summary explaining why the changes are needed? That makes it easier for folks to review the patch but also helps us in the future when digging through historical changes. Thanks!

Sure, thank you for the comment.

@Endilll
Copy link
Contributor

Endilll commented Jan 14, 2025

I don't feel qualified to review this, but I appreciate that you fix holes in my original quite naive implementation of this attribute.

@VitaNuo VitaNuo changed the title [WIP][Modules] Delay deserialization of preferred_name attribute at r… [Modules] Delay deserialization of preferred_name attribute at r… Jan 14, 2025
Copy link
Member

@ChuanqiXu9 ChuanqiXu9 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks.

(1) Hardcoding is generally not good. And even if we have to, e.g., (we don't have an automatic mechanism to check if we need to defer an attribute), let's avoid hardcoding in both reader side and writer side. We can make this by adding a new bit in the record of attribute to indicate if we need to delay it or not in the reader side generally.

(2) Then I think we can keep the signature of readAttr as is, where it will always skip the check for the delay bit and always try to read it. Then we can add a readOrDeferFor(Decl *D) method as you did. The readOrDeferFor will try to read the deferring bits directly and if it is true, it will record the length skipCount and D to PendingDeferredAttributes . Otherwise, it will call readAttr. I think it is more clear.

/// More attributes that store TypeSourceInfo might be potentially affected,
/// see https://github.com/llvm/llvm-project/issues/56490 for details.
struct DeferredAttribute {
uint64_t RecordIdx;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Add a comment to the field. e.g., something like the Index of the delayed attribute in the Record of the targeted decl.

struct DeferredAttribute {
uint64_t RecordIdx;
// Decl to attach a deferred attribute to.
Decl *ParentDecl;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Decl *ParentDecl;
Decl *TargetedDecl;

@llvmbot llvmbot added the clang:frontend Language frontend issues, e.g. anything involving "Sema" label Jan 16, 2025
Copy link
Member

@ChuanqiXu9 ChuanqiXu9 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM basically.

@@ -713,6 +713,10 @@ class Attr {
// attribute may be documented under multiple categories, more than one
// Documentation entry may be listed.
list<Documentation> Documentation;
// Set to true if deserialization of this attribute must be deferred until
// the parent Decl is fully deserialized (during header module file
// deserialization).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It reads better if we can give an example to describe when we want to defer it.

@@ -3240,6 +3244,7 @@ def PreferredName : InheritableAttr {
let InheritEvenIfAlreadyPresent = 1;
let MeaningfulToClassTemplateDefinition = 1;
let TemplateDependent = 1;
let DeferDeserialization = 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it is better to add comment here to explain why we want to defer PreferredName.

Comment on lines +10082 to +10085
// Load the delayed preferred name attributes.
for (unsigned I = 0; I != PendingDeferredAttributes.size(); ++I)
loadDeferredAttribute(PendingDeferredAttributes[I]);
PendingDeferredAttributes.clear();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Load the delayed preferred name attributes.
for (unsigned I = 0; I != PendingDeferredAttributes.size(); ++I)
loadDeferredAttribute(PendingDeferredAttributes[I]);
PendingDeferredAttributes.clear();
// Load the delayed preferred name attributes.
while (!PendingDeferredAttributes.empty()) {
auto DeferredAttributes = std::move(PendingDeferredAttributes);
for (DeferredAttribute &DA : DeferredAttribute)
loadDeferredAttribute(DA);
}

Technically, it is possible to update PendingDeferredAttributes during loadDeferredAttribute().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. The original snippet seems more concise though, so it might be preferable to stick to it for now.

Comment on lines 3182 to 3184
unsigned SkipCount = Record.readInt();
if (!SkipCount)
return readAttr();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now it assumes that readAttr() should be called after the skip bit. Then the caller has more responsibility. I feel it better to skip it in readAttr and not read here.

e.g.,

Suggested change
unsigned SkipCount = Record.readInt();
if (!SkipCount)
return readAttr();
unsigned SkipCount = Record.peekInt();
if (!SkipCount)
return readAttr();

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah nice idea. I didn't realize peekInt() is an option.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense! However, I also had to reorder the fields to write the attribute kind before the skip count, since some code paths depend on reading the (absent) attribute kind first and exiting the readAtrr function.

@VitaNuo
Copy link
Contributor Author

VitaNuo commented Jan 16, 2025

Thank you for the review @ChuanqiXu9!

(1) Hardcoding is generally not good. And even if we have to, e.g., (we don't have an automatic mechanism to check if we need to defer an attribute), let's avoid hardcoding in both reader side and writer side. We can make this by adding a new bit in the record of attribute to indicate if we need to delay it or not in the reader side generally.

Added a DeferDeserialization bit to Attr class that is set to true for the preferred_name and to false for all other attributes.
Depending on this field, the skip count field is set on the record (to the count of record fields to skip on initial read for the preferred_name, and to 0 otherwise).

(2) Then I think we can keep the signature of readAttr as is, where it will always skip the check for the delay bit and always try to read it. Then we can add a readOrDeferFor(Decl *D) method as you did. The readOrDeferFor will try to read the deferring bits directly and if it is true, it will record the length skipCount and D to PendingDeferredAttributes . Otherwise, it will call readAttr. I think it is more clear.

Yes this makes sense. It seems like one field (skip count) should be enough.

Copy link
Member

@ChuanqiXu9 ChuanqiXu9 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with nits.

Attr *ASTRecordReader::readAttr() {
AttrReader Record(*this);
auto V = Record.readInt();
if (!V)
return nullptr;

// Read and ignore the skip count, since attribute deserialization is not
// deferred on this pass.
Record.readInt();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: skipInt();

@@ -83,6 +83,9 @@ class ASTRecordReader
/// Returns the current value in this record, without advancing.
uint64_t peekInt() { return Record[Idx]; }

/// Returns the next value in this record, without advancing.
uint64_t peekNextInt() { return Record[Idx + 1]; }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
uint64_t peekNextInt() { return Record[Idx + 1]; }
uint64_t peekInts(unsigned N) { return Record[Idx + N]; }

nit:

@VitaNuo VitaNuo merged commit c3ba6f3 into llvm:main Jan 17, 2025
8 checks passed
ilya-biryukov added a commit that referenced this pull request Jan 22, 2025
…t r… (#122726)"

This reverts commit c3ba6f3.

We are seeing performance regressions of up to 40% on some compilations
with this patch, we will investigate and reland after fixing performance
issues.
@benlangmuir
Copy link
Collaborator

I see that this change has already been reverted by @ilya-biryukov , but FYI before this is re-applied: I am seeing crashes from this change if I attempt to bootstrap a build of clang with modules enabled on Darwin. I haven't yet managed to minimize a test case, but compiling the following triggers the issue at least when using the current Darwin libc++

#  include <string_view>

template <class _CharT, class _Traits, class _Tp>
struct __can_be_converted_to_string_view
    : public _BoolConstant< is_convertible<const _Tp&, basic_string_view<_CharT, _Traits> >::value &&
                            !is_convertible<const _Tp&, const _CharT*>::value > {};

The crash is because we get a call to loadDeferredAttribute with a nullptr decl. During the read of basic_string_view from the module we end up calling readAttributes with no decl. Not sure what was supposed to happen here.

(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x1c)
  * frame #0: 0x00000001032c293c clang++`clang::Decl::isFromASTFile(this=0x0000000000000000) const at DeclBase.h:786:39
    frame #1: 0x00000001032c28ec clang++`clang::ASTReader::getOwningModuleFile(this=0x000000015102d800, D=0x0000000000000000) const at ASTReader.cpp:8016:11
    frame #2: 0x00000001034642f8 clang++`clang::ASTReader::loadDeferredAttribute(this=0x000000015102d800, DA=0x0000600003cb01e0) at ASTReaderDecl.cpp:4517:19
    frame #3: 0x00000001032d35b4 clang++`clang::ASTReader::finishPendingActions(this=0x000000015102d800) at ASTReader.cpp:10244:7
    frame #4: 0x00000001032db234 clang++`clang::ASTReader::FinishedDeserializing(this=0x000000015102d800) at ASTReader.cpp:10799:5
    frame #5: 0x00000001033111bc clang++`clang::ExternalASTSource::Deserializing::~Deserializing(this=0x000000016fdeedf0) at ExternalASTSource.h:88:15
    frame #6: 0x000000010328fbec clang++`clang::ExternalASTSource::Deserializing::~Deserializing(this=0x000000016fdeedf0) at ExternalASTSource.h:87:22
    frame #7: 0x00000001032c5ddc clang++`clang::ASTReader::FindExternalVisibleDeclsByName(this=0x000000015102d800, DC=0x000000015157f240, Name=(Ptr = 5659675880), OriginalDC=0x000000015157f240) at ASTReader.cpp:8525:1
    frame #8: 0x0000000105794d78 clang++`clang::DeclContext::lookupImpl(clang::DeclarationName, clang::DeclContext const*) const + 716
    frame #9: 0x0000000104e13738 clang++`LookupDirect(clang::Sema&, clang::LookupResult&, clang::DeclContext const*) + 108

@ilya-biryukov
Copy link
Contributor

@benlangmuir thanks for pointing this out, we should definitely keep that in mind when relanding. We'll try to run the bootstrap process to fix that too.

Are the particular steps for the bootstraping you do documented somewhere? E.g. maybe we're lucky and one of the files in https://github.com/llvm/llvm-project/tree/main/clang/cmake/caches can be used for that?

@benlangmuir
Copy link
Collaborator

There's probably a better way, but I just built my first clang normally then built a second one with
-DCMAKE_C_COMPILER and -DCMAKE_CXX_COMPILER pointing to the first one, and -DLLVM_ENABLE_MODULES=1 to enable modules.

hokein added a commit that referenced this pull request Feb 17, 2025
github-actions bot pushed a commit to arm/arm-toolchain that referenced this pull request Feb 17, 2025
sivan-shani pushed a commit to sivan-shani/llvm-project that referenced this pull request Feb 24, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:modules C++20 modules and Clang Header Modules clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Modules] Incorrect ODR checks caused by preferred_name attribute.
7 participants