Description
Describe the bug
When c2cpg analyzes Google's V8 codebase, CdtParser consistently fails with a ParserException on several advanced C++ template metaprogramming constructs.
The parser logs an "ambiguous node" error (raised from ASTAmbiguousNode) for several different template expressions, including (but not limited to):
- Calls with non-type template arguments:
std::get<I>(tpl)
- Generic lambdas with fold expressions:
ApplyIndex<...>([](auto... I) { ... })
- Complex templated casting functions:
CastExposedTrustedObjectByTag<tag>(object)
Because these constructs sit in widely included headers in V8 and its dependencies, a single failing header causes parsing failures in every source file that includes it. As a result, many critical source files are missing from the final CPG, which leaves it too incomplete for a thorough analysis.
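For reference, the three construct shapes listed above can be sketched in a self-contained form. This is only an illustration: GetAt, SumTuple, Tag, CastByTag and Caller are hypothetical names, and the ApplyIndex here is a simplified stand-in, not the definition from V8 or fuzztest. Whether this reduced snippet triggers the same ParserException in CDT has not been verified. It assumes C++17 or later.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <utility>

// 1. Call with a non-type template argument, same shape as std::get<I>(tpl).
template <std::size_t I, typename Tuple>
constexpr auto GetAt(const Tuple& tpl) {
  return std::get<I>(tpl);
}

// 2. Simplified stand-in for ApplyIndex (hypothetical): calls f with the
//    integral constants 0..N-1 passed as a parameter pack.
template <std::size_t... Is, typename F>
constexpr auto ApplyIndexImpl(F f, std::index_sequence<Is...>) {
  return f(std::integral_constant<std::size_t, Is>{}...);
}

template <std::size_t N, typename F>
constexpr auto ApplyIndex(F f) {
  return ApplyIndexImpl(f, std::make_index_sequence<N>{});
}

template <typename Tuple>
constexpr auto SumTuple(const Tuple& tpl) {
  // Generic lambda taking a pack, expanded by a fold expression.
  return ApplyIndex<std::tuple_size_v<Tuple>>(
      [&](auto... I) { return (std::get<I>(tpl) + ... + 0); });
}

// 3. Templated cast-like helper selected by an enum tag, same shape as
//    CastExposedTrustedObjectByTag<tag>(object). All names are hypothetical.
enum class Tag { kA, kB };

template <Tag tag>
constexpr int CastByTag(int object) {
  return tag == Tag::kA ? object : -object;
}

template <Tag tag>
constexpr int Caller(int object) {
  return CastByTag<tag>(object);
}
```

Each function is small enough to be pasted into a single header and fed to c2cpg on its own, which may help narrow down which shape CDT rejects.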
To Reproduce
Steps to reproduce the behavior:
- Set up depot_tools and fetch the V8 source code.
# Clone the depot_tools repository
git clone https://chromium.googlesource.com/chromium/tools/depot_tools.git
# Add depot_tools to your PATH for the current session
export PATH=$(pwd)/depot_tools:$PATH
# Create a directory for v8 and fetch the source code
mkdir v8_project && cd v8_project
fetch v8
# Navigate into the v8 source directory
cd v8
- Install build dependencies.
./build/install-build-deps.sh
- Build d8
git checkout 14.2.18
gclient sync
# Generate the Ninja build files (compile_commands.json is created by hand in a later step)
gn gen out/Default
# Build the d8 shell to ensure all necessary source files are generated
ninja -C out/Default d8
- Create compile_commands.json in the out/Default directory.
Set the directory field to the absolute path of out/Default.
[
{
"file": "../../src/interpreter/interpreter-generator.cc",
"directory": "<absolute path of out/Default>",
"command": " ../../third_party/llvm-build/Release+Asserts/bin/clang++ -MMD -MF obj/v8_initializers/interpreter-generator.o.d -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D_FORTIFY_SOURCE=2 -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE -D_GNU_SOURCE \"-DCR_CLANG_REVISION=\\\"llvmorg-21-init-16348-gbd809ffb-17\\\"\" -D_LIBCPP_HARDENING_MODE=_LIBCPP_HARDENING_MODE_EXTENSIVE -D_LIBCPP_DISABLE_VISIBILITY_ANNOTATIONS -D_LIBCXXABI_DISABLE_VISIBILITY_ANNOTATIONS -D_LIBCPP_INSTRUMENTED_WITH_ASAN=0 -DCR_LIBCXX_REVISION=23b5bc93867b93b73f7be97cf2e8a71e95770e07 -DCR_SYSROOT_KEY=20250129T203412Z-1 -DUSE_UDEV -DUSE_AURA=1 -DUSE_GLIB=1 -DUSE_OZONE=1 -DNDEBUG -DNVALGRIND -DDYNAMIC_ANNOTATIONS_ENABLED=0 -DV8_TYPED_ARRAY_MAX_SIZE_IN_HEAP=64 -DENABLE_GDB_JIT_INTERFACE -DV8_INTL_SUPPORT -DV8_TEMPORAL_SUPPORT -DV8_USE_EXTERNAL_STARTUP_DATA -DV8_ATOMIC_OBJECT_FIELD_WRITES -DV8_ENABLE_LAZY_SOURCE_POSITIONS -DV8_WIN64_UNWINDING_INFO -DV8_ENABLE_REGEXP_INTERPRETER_THREADED_DISPATCH -DV8_ENABLE_FUZZTEST -DV8_SHORT_BUILTIN_CALLS -DV8_EXTERNAL_CODE_SPACE -DV8_ENABLE_SPARKPLUG -DV8_ENABLE_MAGLEV -DV8_ENABLE_TURBOFAN -DV8_ENABLE_WEBASSEMBLY -DV8_ENABLE_CONTINUATION_PRESERVED_EMBEDDER_DATA -DV8_ALLOCATION_FOLDING -DV8_ALLOCATION_SITE_TRACKING -DV8_ADVANCED_BIGINT_ALGORITHMS -DV8_STATIC_ROOTS -DV8_USE_ZLIB -DV8_USE_LIBM_TRIG_FUNCTIONS -DV8_ENABLE_WASM_SIMD256_REVEC -DV8_ENABLE_MAGLEV_GRAPH_PRINTER -DV8_ENABLE_BUILTIN_JUMP_TABLE_SWITCH -DV8_ENABLE_EXTENSIBLE_RO_SNAPSHOT -DV8_ENABLE_BLACK_ALLOCATED_PAGES -DV8_ENABLE_LEAPTIERING -DV8_WASM_RANDOM_FUZZERS -DV8_ARRAY_BUFFER_INTERNAL_FIELD_COUNT=0 -DV8_ARRAY_BUFFER_VIEW_INTERNAL_FIELD_COUNT=0 -DV8_PROMISE_INTERNAL_FIELD_COUNT=0 -DV8_USE_DEFAULT_HASHER_SECRET=true -DV8_COMPRESS_POINTERS -DV8_COMPRESS_POINTERS_IN_SHARED_CAGE -DV8_31BIT_SMIS_ON_64BIT_ARCH -DV8_ENABLE_SANDBOX -DV8_DEPRECATION_WARNINGS -DV8_IMMINENT_DEPRECATION_WARNINGS -DV8_HAVE_TARGET_OS -DV8_TARGET_OS_LINUX -DCPPGC_CAGED_HEAP -DCPPGC_YOUNG_GENERATION 
-DCPPGC_POINTER_COMPRESSION -DCPPGC_ENABLE_LARGER_CAGE -DCPPGC_SLIM_WRITE_BARRIER -DV8_TARGET_ARCH_X64 -DV8_RUNTIME_CALL_STATS -DABSL_ALLOCATOR_NOTHROW=1 -DU_USING_ICU_NAMESPACE=0 -DU_ENABLE_DYLOAD=0 -DUSE_CHROMIUM_ICU=1 -DU_ENABLE_TRACING=1 -DU_ENABLE_RESOURCE_TRACING=0 -DU_STATIC_IMPLEMENTATION -DICU_UTIL_DATA_IMPL=ICU_UTIL_DATA_FILE -DUSE_LIBCXX_MODULES -I../.. -Igen -I../../buildtools/third_party/libc++ -I../../include -Igen/include -I../../third_party/abseil-cpp -I../../third_party/icu/source/common -I../../third_party/icu/source/i18n -I../../third_party/fp16/src/include -Wall -Wextra -Wimplicit-fallthrough -Wextra-semi -Wunreachable-code-aggressive -Wgnu -Wno-gnu-anonymous-struct -Wno-gnu-conditional-omitted-operand -Wno-gnu-include-next -Wno-gnu-label-as-value -Wno-gnu-redeclared-enum -Wno-gnu-statement-expression -Wno-gnu-zero-variadic-macro-arguments -Wno-zero-length-array -Wthread-safety -Wno-missing-field-initializers -Wno-unused-parameter -Wno-psabi -Wloop-analysis -Wno-unneeded-internal-declaration -Wno-cast-function-type -Wno-thread-safety-reference-return -Wno-nontrivial-memcall -Wshadow -Werror -fno-delete-null-pointer-checks -fno-strict-overflow -fno-ident -fno-math-errno -fno-strict-aliasing -fstack-protector -funwind-tables -fPIC -pthread -fcolor-diagnostics -fmerge-all-constants -fno-sized-deallocation -fcrash-diagnostics-dir=../../tools/clang/crashreports -mllvm -instcombine-lower-dbg-declare=0 -mllvm -split-threshold-for-reg-with-hint=0 -ffp-contract=off -Wa,--crel,--allow-experimental-crel -fcomplete-member-pointers --target=x86_64-unknown-linux-gnu -msse3 -Wno-builtin-macro-redefined -D__DATE__= -D__TIME__= -D__TIMESTAMP__= -ffile-compilation-dir=. 
-no-canonical-prefixes -Xclang -fmodule-file-home-is-cwd -fno-omit-frame-pointer -g0 -fvisibility=hidden -Wheader-hygiene -Wstring-conversion -Wtautological-overlap-compare -Wunreachable-code -Wctad-maybe-unsupported -Xclang -add-plugin -Xclang blink-gc-plugin -Wno-invalid-offsetof -Wshorten-64-to-32 -Wmissing-field-initializers -Wunnecessary-virtual-specifier -O3 -fdata-sections -ffunction-sections -fno-unique-section-names -Wno-invalid-offsetof -Wenum-compare-conditional -Wno-nullability-completeness -std=c++20 -Wno-trigraphs -gsimple-template-names -fno-exceptions -fno-rtti -nostdinc++ -isystemgen/third_party/libc++/src/include -isystem../../third_party/libc++abi/src/include --sysroot=../../build/linux/debian_bullseye_amd64-sysroot -fvisibility-inlines-hidden -fmodules -fno-implicit-module-maps -fno-implicit-modules -Xclang -fmodules-local-submodule-visibility -Wno-modules-ambiguous-internal-linkage -Wno-modules-import-nested-redundant -Wno-module-import-in-extern-c -fbuiltin-module-map -fmodule-map-file=../../build/modules/linux-x64/module.modulemap -fmodule-map-file=gen/third_party/libc++/src/include/module.modulemap -c ../../src/interpreter/interpreter-generator.cc -o obj/v8_initializers/interpreter-generator.o"
}
]
- Run c2cpg.sh
SL_LOGGING_LEVEL=DEBUG c2cpg.sh -J-Xmx30208m --output workspace/v8/cpg.bin.zip --compilation-database out/Default/compile_commands.json .
Expected behavior
Joern's C/C++ parser should be able to handle these modern C++ constructs without throwing a ParserException. The parsing of source files should not be halted due to ambiguities in included headers, and the resulting CPG should be as complete as possible.
Screenshots
N/A, as this is a command-line and logging issue. The relevant logs are provided below.
Desktop (please complete the following information):
- OS: Ubuntu 24.04.2 LTS
- Joern Version: 4.0.413
- Java version: OpenJDK 21.0.8 2025-07-15
Additional context
The root cause appears to be a limitation in how the Eclipse CDT parser resolves ambiguities in advanced C++ constructs. Every failure observed so far involves template metaprogramming.
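One plausible source of the ambiguity is the classic C++ problem that `name < X > (arg)` is either a call to a template or two chained comparisons, depending on what `name` refers to. The sketch below shows both readings of the same token shape. AsTemplateCall and AsComparison are hypothetical names, and it is an assumption that this is the ambiguity CDT fails to resolve here; the logs only state that an ambiguous node was encountered.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>

// Reading A: `std::get` names a template, so `<I>` is a template argument
// list and the whole expression is a function call.
template <std::size_t I, typename Tuple>
int AsTemplateCall(const Tuple& tpl) {
  return std::get<I>(tpl);
}

// Reading B: the same token shape, but every name is a plain variable,
// so the expression is two chained comparisons: (get < I) > (tpl).
bool AsComparison(int get, int I, int tpl) {
  return get < I > (tpl);
}
```

A compiler picks the reading through name lookup. A parser that cannot resolve the relevant name, for example because a header or macro definition is missing, has to keep both alternatives open until it can decide.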
Below are three distinct examples of the ParserException, all encountered while parsing a single file (interpreter-generator.cc). Each is triggered by a different construct in a different header file, which shows how widespread the issue is.
2025-09-05 00:56:12.594 DEBUG CCorePlugin Encountered an ambiguous node "std::get<I>(tpl)" at src/interpreter/../../sys/src/base/src/base/template-utils.h, line 118 while parsing src/interpreter/interpreter-generator.cc
org.eclipse.cdt.internal.core.parser.ParserException: Encountered an ambiguous node "std::get<I>(tpl)" at src/interpreter/../../sys/src/base/src/base/template-utils.h, line 118 while parsing src/interpreter/interpreter-generator.cc
at org.eclipse.cdt.internal.core.dom.parser.ASTAmbiguousNode.logAmbiguousNodeError(ASTAmbiguousNode.java:191)
... (full stack trace)
2025-09-05 00:56:14.136 DEBUG CCorePlugin Encountered an ambiguous node "ApplyIndex<std::tuple_size_v<Tuple>>([](auto... I) { ... })" at src/interpreter/../../libxml/src/ast/src/ast/src/ast/src/objects/src/objects/src/objects/src/objects/src/sandbox/src/sandbox/src/sandbox/src/runtime/fuzztest/internal/fuzztest/internal/fuzztest/fuzztest/internal/domains/aggregate_of_impl.h, line 191 while parsing src/interpreter/interpreter-generator.cc
org.eclipse.cdt.internal.core.parser.ParserException: Encountered an ambiguous node "ApplyIndex<std::tuple_size_v<Tuple>>([](auto... I) { ... })" at ...
at org.eclipse.cdt.internal.core.dom.parser.ASTAmbiguousNode.logAmbiguousNodeError(ASTAmbiguousNode.java:191)
... (full stack trace)
2025-09-05 00:56:14.235 DEBUG CCorePlugin Encountered an ambiguous node "CastExposedTrustedObjectByTag<tag>(object)" at src/interpreter/../../libxml/src/ast/src/ast/src/ast/src/objects/src/objects/src/objects/heap-object.h, line 453 while parsing src/interpreter/interpreter-generator.cc
org.eclipse.cdt.internal.core.parser.ParserException: Encountered an ambiguous node "CastExposedTrustedObjectByTag<tag>(object)" at ...
at org.eclipse.cdt.internal.core.dom.parser.ASTAmbiguousNode.logAmbiguousNodeError(ASTAmbiguousNode.java:191)
... (full stack trace)