Description
The current implementation matches .java and .class files using paths, the classpath, Java packages and compiler conventions. In some cases these techniques will not produce a correct match.
For instance, the .class code may not have been compiled from Java at all. It could have been generated directly as bytecode with the ASM library or similar bytecode engineering tools. This is common with Hibernate and other data frameworks, and with SOAP or web services tooling that generates code from @ annotations or XML documents.
To recap:
- Hibernate or JPA code generated from XML or annotations
- XML Schema, SOAP and web services code generated from XML, like with JAXB. See https://stackoverflow.com/questions/11463231/how-to-generate-jaxb-classes-from-xsd
- See also for instance https://github.com/search?q="%40XmlAttribute"+language%3AJava+&type=code ... here is a case where the source is an .xsd schema and the binaries are generated .class files: https://repo1.maven.org/maven2/eu/europa/ec/joinup/sd-dss/specs-xades/6.3.RC1/
- Other non-Java source languages, like Groovy or AspectJ
- ASM-generated or enhanced bytecode
Here the approach would be to:
- Collect source symbols with the "purl2sym" collect_symbols* pipelines or custom processing for XML
- Collect symbols from the binaries, either using lief or using binary strings as can be collected with the scancode-toolkit (we are missing a plugin for this)
- Match the source symbols to the binary symbols, sort candidates by the number of matches, and report the correct matches to create a relation between a source and a binary
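
The matching step above could be sketched roughly as follows. This is a minimal illustration only, not the existing purl2sym or scancode-toolkit code: the function names `binary_symbols` and `rank_sources`, the identifier regex, the minimum length of four characters, and the sample data are all assumptions made for this example. It extracts identifier-like strings from raw binary content (similar in spirit to the `strings` tool), then ranks candidate sources by how many symbols they share with the binary.

```python
import re

# Assumption: a "symbol" is any ASCII identifier of at least 4 characters.
IDENTIFIER = re.compile(rb"[A-Za-z_][A-Za-z0-9_]{3,}")


def binary_symbols(data):
    """Return the set of identifier-like ASCII strings found in raw bytes."""
    return {match.decode("ascii") for match in IDENTIFIER.findall(data)}


def rank_sources(binary_syms, sources):
    """
    Rank candidate sources against one binary.

    `sources` maps a source path to the set of symbols collected from it.
    Returns (path, shared_symbol_count) tuples, most shared symbols first,
    skipping sources that share no symbol with the binary.
    """
    scored = [(path, len(binary_syms & syms)) for path, syms in sources.items()]
    scored = [(path, count) for path, count in scored if count > 0]
    return sorted(scored, key=lambda item: (-item[1], item[0]))


# Made-up sample data: a fake byte string standing in for a .class file,
# and hypothetical symbol sets for three candidate sources.
fake_class = (
    b"\xca\xfe\xba\xbe\x00\x01QualifyingProperties"
    b"\x00\x09getTarget\x00\x09setTarget\x00"
)
candidate_sources = {
    "QualifyingProperties.java": {"QualifyingProperties", "getTarget", "setTarget"},
    "Other.java": {"getTarget", "unrelated"},
    "Nothing.java": {"fooBar"},
}

ranking = rank_sources(binary_symbols(fake_class), candidate_sources)
print(ranking)
```

A real implementation would likely need to normalize symbols first (for example, mapping an XML element name to its generated getter and setter names) and apply a minimum score before reporting a relation.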