extractcode crash with invalid gzip archive #2723

Open
@sutula

Description

In normal use of extractcode, I ran across a number of cases where extractcode crashes. So far, I've investigated one of these cases.

The target archive is here. Within the Ruby Rake zipfile is what appears to be a compressed documentation file, rake-0.9.2.2/doc/rake.1.gz, which can be downloaded directly here if you wish.

Manually trying to decompress the offending rake.1.gz file with gunzip, I get:

gunzip rake.1.gz
gzip: rake.1.gz: unexpected end of file

echo $?
1
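The same truncation can be detected from Python without shelling out to gunzip. Below is a minimal sketch that uses only the standard library `gzip` module. The helper name `gzip_is_intact` is hypothetical and is not part of extractcode; it only illustrates that a truncated stream surfaces as an exception that a caller can catch.

```python
import gzip
import zlib


def gzip_is_intact(path, chunk_size=64 * 1024):
    """Return (ok, error_message) for the gzip file at `path`.

    Hypothetical helper for illustration only, not an extractcode API.
    """
    try:
        with gzip.open(path, "rb") as f:
            # Read to the end so the trailer (CRC and length) is validated too.
            while f.read(chunk_size):
                pass
    except (EOFError, OSError, zlib.error) as exc:
        # EOFError: the stream ended early (the "unexpected end of file" case).
        # OSError: not a gzip file, or a CRC/length mismatch.
        # zlib.error: corrupt deflate data.
        return False, f"{type(exc).__name__}: {exc}"
    return True, None
```

Calling `gzip_is_intact("rake.1.gz")` on a truncated file should return `False` together with a message describing the failure, rather than raising.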

Running extractcode on the entire v0.9.2.2.zip produces the output in the attached file, basically a crash when extractcode attempts to copy the extracted output into the destination tree. I'm guessing this is a case where a failure of the archive unpack utility is not detected.

What should be done in cases like this? As a user, I'd like to know that portions of the extract failed, in case I want to investigate further. But in the meantime, I'd like to see the extract proceed and continue trying to extract the other portions of the archive. The end goal is license analysis of the package, and it's helpful to have license results for the non-corrupt portions of the package, even if one particular file or subdirectory cannot be scanned.
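To illustrate the behavior I have in mind (record the failure, keep going), here is a rough sketch using only the Python standard library. It is not how extractcode is implemented; the function name `extract_gzips_tolerantly`, the `.partial` suffix, and the choice to write the output next to the source file are all assumptions made for the example.

```python
import gzip
import os
import shutil
import zlib


def extract_gzips_tolerantly(root):
    """Decompress every *.gz file under `root`, continuing past failures.

    Returns (extracted_paths, errors), where `errors` maps each failing
    source path to a short message. Hypothetical sketch, not extractcode code.
    """
    extracted, errors = [], {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".gz"):
                continue
            src = os.path.join(dirpath, name)
            dst = src[: -len(".gz")]
            partial = dst + ".partial"
            try:
                # Write to a temporary name so a failed extraction never
                # leaves a half-written file under the final name.
                with gzip.open(src, "rb") as fin, open(partial, "wb") as fout:
                    shutil.copyfileobj(fin, fout)
                os.replace(partial, dst)
                extracted.append(dst)
            except (EOFError, OSError, zlib.error) as exc:
                if os.path.exists(partial):
                    os.remove(partial)
                # Record the failure and move on to the next file.
                errors[src] = f"{type(exc).__name__}: {exc}"
    return extracted, errors
```

A caller could then print each entry of `errors` as a warning at the end of the run, so the user knows which files were skipped while still getting scan results for everything else.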

My test system is: Debian Linux, Python 3.7.3, scancode-toolkit 30.1.0, installed from the released Linux tar archive.
