
Commit b6acc86

Document the inference tool
1 parent db0237e commit b6acc86

2 files changed: +55 -6 lines


README.md

Lines changed: 47 additions & 0 deletions
@@ -24,6 +24,7 @@ _Predicted labels for some randomly chosen samples. Format: prediction (confiden
 
 ## Table of Contents
 
+[Quickstart](#quickstart)<br />
 [Dataset](#dataset)<br />
 [Development](#development)<br />
 [Development - Quickstart](#development-quickstart)<br />
@@ -32,6 +33,52 @@ _Predicted labels for some randomly chosen samples. Format: prediction (confiden
 [Development - Quickstart - Training and Evaluation](#development-quickstart-training)<br />
 [Development - Tools](#development-tools)
 
+## Quickstart
+<a name="quickstart"></a>
+
+_Note: These instructions are only for inference using the pre-trained model._
+
+First download the latest release from [releases](https://github.com/AlexGustafsson/compdec/releases). The release contains three files: a pre-trained model, a Python script, and a Dockerfile.
+
+If you do not wish to install all the prerequisites mentioned under [Development - Quickstart](#development-quickstart), build the Docker image instead:
+
+```sh
+cd compdec
+docker build -t compdec .
+```
+
+Now you can run the script natively or via Docker:
+
+```sh
+# Docker
+docker run -it -v "/path/to/samples:/samples" compdec /samples/unknown-file1.bin /samples/unknown-file2.bin
+# Native
+python3 ./compdec.py /path/to/samples/unknown-file1.bin /path/to/samples/unknown-file2.bin
+```
+
+The tool will produce output like the following:
+
+```
+/path/to/samples/unknown-file1.bin
+7z       : 0.00%
+brotli   : 0.00%
+bzip2    : 0.00%
+compress : 0.00%
+gzip     : 0.00%
+lz4      : 100.00%
+rar      : 0.00%
+zip      : 0.00%
+/path/to/samples/unknown-file2.bin
+7z       : 0.00%
+brotli   : 0.00%
+bzip2    : 0.00%
+compress : 100.00%
+gzip     : 0.00%
+lz4      : 0.00%
+rar      : 0.00%
+zip      : 0.00%
+```
+
 ## Dataset
 <a name="dataset"></a>
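As an optional sanity check of the Quickstart above, the sketch below creates a gzip-compressed sample and feeds it to the released script. Everything in it is illustrative and not part of this commit: the `samples/` directory, the file name `test.gz`, and running `python3 ./compdec.py` from the release directory are assumptions.

```python
# Hypothetical smoke test for the Quickstart (not part of this commit).
# It writes a gzip stream large enough to yield at least one model chunk,
# then calls the released script the same way the README documents.
import gzip
import os
import subprocess

os.makedirs("samples", exist_ok=True)

# Random bytes do not compress, so the resulting .gz file stays around 1 MiB,
# comfortably above any plausible chunk size used by the model.
payload = os.urandom(1024 * 1024)
with gzip.open("samples/test.gz", "wb") as archive:
    archive.write(payload)

# Same invocation as the "Native" line in the Quickstart.
subprocess.run(["python3", "./compdec.py", "samples/test.gz"], check=True)
```

If the model behaves as in the example output, the gzip row should receive most of the confidence for this file.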

compdec/compdec.py

Lines changed: 8 additions & 6 deletions
@@ -40,6 +40,8 @@ def print_version() -> None:
     print("Model hash: {}".format(hash))
 
 def load_samples_from_file(sample_path):
+    import numpy
+
     with open(sample_path, "rb") as sample_file:
         sample_file.seek(0, 2)
         file_size = sample_file.tell()
@@ -57,13 +59,13 @@ def load_samples_from_file(sample_path):
         samples.append(sample)
     return samples
 
-def predict(file_paths, model_path):
+def predict(sample_paths, model_path):
     import tensorflow
     import numpy
 
     model = tensorflow.keras.models.load_model(model_path)
-    for file_path in file_paths:
-        samples = dataset_utilities.load_samples_from_file(sample_path)
+    for sample_path in sample_paths:
+        samples = load_samples_from_file(sample_path)
 
         if len(samples) == 0:
             print("There are no chunks big enough in the sample file. Expected at least {}B".format(CHUNK_SIZE))
@@ -77,9 +79,9 @@ def softmax(predictions):
         prediction_sum = sum(predictions)
         normalized_predictions = softmax(prediction_sum)
 
-        print(file_path)
-        for i in range(len(dataset_utilities.CLASS_NAMES)):
-            print("{:9}: {:2.2f}%".format(dataset_utilities.CLASS_NAMES[i], normalized_predictions[i] * 100))
+        print(sample_path)
+        for i in range(len(CLASS_NAMES)):
+            print("{:9}: {:2.2f}%".format(CLASS_NAMES[i], normalized_predictions[i] * 100))
 
 def main():
     parser = ArgumentParser(add_help=False)
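For readers following the diff above, here is a minimal, self-contained sketch of the inference flow that `load_samples_from_file` and `predict` implement after this commit. It is an approximation, not the actual compdec.py: `CHUNK_SIZE`, the 1/255 byte scaling, the softmax definition, and the model's input shape are assumptions, while `CLASS_NAMES` mirrors the README's example output.

```python
# Approximate sketch of the inference flow touched by this commit; the real
# compdec.py differs in details. CHUNK_SIZE, the byte scaling and the model
# input shape are assumptions; CLASS_NAMES mirrors the README example output.
import numpy
import tensorflow

CHUNK_SIZE = 4096  # assumed chunk length in bytes
CLASS_NAMES = ["7z", "brotli", "bzip2", "compress", "gzip", "lz4", "rar", "zip"]

def softmax(values):
    # Numerically stable softmax over the summed per-class scores.
    shifted = numpy.asarray(values, dtype=numpy.float64) - numpy.max(values)
    exponentials = numpy.exp(shifted)
    return exponentials / numpy.sum(exponentials)

def load_samples_from_file(sample_path):
    # Split the file into fixed-size chunks; a short trailing chunk is dropped.
    samples = []
    with open(sample_path, "rb") as sample_file:
        while True:
            chunk = sample_file.read(CHUNK_SIZE)
            if len(chunk) < CHUNK_SIZE:
                break
            samples.append(numpy.frombuffer(chunk, dtype=numpy.uint8) / 255.0)
    return samples

def predict(sample_paths, model_path):
    model = tensorflow.keras.models.load_model(model_path)
    for sample_path in sample_paths:
        samples = load_samples_from_file(sample_path)
        if len(samples) == 0:
            print("There are no chunks big enough in the sample file. Expected at least {}B".format(CHUNK_SIZE))
            continue
        # One prediction vector per chunk; summing pools evidence across the file.
        predictions = model.predict(numpy.stack(samples))
        prediction_sum = numpy.sum(predictions, axis=0)
        normalized_predictions = softmax(prediction_sum)
        print(sample_path)
        for i in range(len(CLASS_NAMES)):
            print("{:9}: {:2.2f}%".format(CLASS_NAMES[i], normalized_predictions[i] * 100))
```

Summing the per-chunk prediction vectors before the softmax pools evidence across the whole file, which is why each input file yields a single confidence row per class in the example output above.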
