Issues downloading databases using docker-compose #112

Hi there, I'm trying to set up an _MMseqs2_ server within my company.
I'm currently using it via [Boltz](https://github.com/jwohlwend/boltz), which uses the [ColabFold API](https://api.colabfold.com) internally, and I'm getting rate limited at scale.

I tried downloading the databases via `docker-compose`, mounting a host directory with `-v` so the downloaded databases can be kept for future use (saving them to AWS S3):
```
docker-compose run --rm -v /db:/opt/mmseqs-web/databases db-setup UniRef100 \
UniRef90 UniRef50 UniProtKB "UniProtKB/TrEMBL" \
"UniProtKB/Swiss-Prot" NR NT GTDB PDB PDB70 \
Pfam-A.full Pfam-A.seed Pfam-B CDD eggNOG VOGDB dbCAN2 SILVA Resfinder Kalamari
```

The error is `Can not allocate entries memory in IndexTable::initMemory`, followed by `Error: indexdb died`.

Full logs (the beginning might be missing):
```
e.g 
createindex /opt/mmseqs-web/databases/UniRef100 /opt/mmseqs-web/databases/tmp_UniRef100 --split 1 

MMseqs Version:                         9c13275673343059cb7e4847c6c89f4b64ce4f9a
Seed substitution matrix                aa:VTML80.out,nucl:nucleotide.out
k-mer length                            0
Alphabet size                           aa:21,nucl:5
Compositional bias                      1
Compositional bias scale                1
Max sequence length                     65535
Max results per query                   300
Mask residues                           1
Mask residues probability               0.9
Mask lower case residues                0
Mask lower letter repeating N times     0
Spaced k-mers                           1
Spaced k-mer pattern               
Sensitivity                             7.5
k-score                                 seq:0,prof:0
Check compatible                        0
Search type                             0
Split database                          1
Split memory limit                      0
Index subset                            0
Verbosity                               3
Threads                                 48
Min codons in orf                       30
Max codons in length                    32734
Max orf gaps                            2147483647
Contig start mode                       2
Contig end mode                         2
Orf start mode                          1
Forward frames                          1,2,3
Reverse frames                          1,2,3
Translation table                       1
Translate orf                           0
Use all table starts                    false
Offset of numeric ids                   0
Create lookup                           0
Compressed                              0
Overlap between sequences               0
Sequence split mode                     1
Header split mode                       0
Translation mode                        0
Strand selection                        1
Remove temporary files                  false

indexdb /opt/mmseqs-web/databases/UniRef100 /opt/mmseqs-web/databases/UniRef100 --seed-sub-mat 'aa:VTML80.out,nucl:nucleotide.out' -k 0 --alph-size aa:21,nucl:5 --comp-bias-corr 1 --comp-bias-corr-scale 1 --max-seq-len 65535 --max-seqs 300 --mask 1 --mask-prob 0.9 --mask-lower-case 0 --mask-n-repeat 0 --spaced-kmer-mode 1 -s 7.5 --k-score seq:0,prof:0 --check-compatible 0 --search-type 0 --split 1 --split-memory-limit 0 --index-subset 0 -v 3 --threads 48 

Estimated memory consumption: 2T
Process needs more than 326G main memory.
Increase the size of --split or set it to 0 to automatically optimize target database split.
Write VERSION (0)
Write META (1)
Write SCOREMATRIXNAME (2)
Write SPACEDPATTERN (23)
Write GENERATOR (22)
Write DBR1INDEX (5)
Write DBR1DATA (6)
Write HDR1INDEX (18)
Write HDR1DATA (19)
Write SCOREMATRIX3MER (4)
Write SCOREMATRIX2MER (3)
Index table: counting k-mers
...
[=================================================================] 100.00% 458.07M 17m 21s 449ms
Index table: Masked residues: 3175422676
Can not allocate entries memory in IndexTable::initMemory
Error: indexdb died
```
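Going by the hint in the log ("Increase the size of --split or set it to 0"), here is a rough, untested sketch of what I assume a manual re-run could look like. The memory figures are taken from the log above; treating "2T" as 2048G and assuming memory scales linearly with the number of splits are my own simplifications, and the `mmseqs` binary name and location inside the container are assumptions as well. The script only prints the command (dry run) rather than executing it.

```shell
#!/bin/sh
# Rough split estimate from the figures in the log above.
# Assumption: memory per split ~ total / splits (ignores fixed overhead).
estimated_gb=2048   # "Estimated memory consumption: 2T" (2T taken as 2048G)
available_gb=326    # "Process needs more than 326G main memory."

# Ceiling division: smallest split count whose per-split share fits in memory.
min_splits=$(( (estimated_gb + available_gb - 1) / available_gb ))
echo "minimum splits (rough lower bound): ${min_splits}"

# Dry run: print the command instead of executing it.
# --split 0 asks MMseqs2 to pick the split count itself, as the log suggests.
# Paths are the ones from the log; the binary name/path is an assumption.
cmd="mmseqs createindex /opt/mmseqs-web/databases/UniRef100 /opt/mmseqs-web/databases/tmp_UniRef100 --split 0"
echo "${cmd}"
```

With these numbers the estimate comes out at 7 splits, so `--split 1` (as used by the setup script in the log) could not fit on this host under that assumption.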


My host specs (AWS `r5d.12xlarge` instance type, [link](https://instances.vantage.sh/aws/ec2/r5d.12xlarge)):
- 48 vCPUs
- ~360GB RAM
- 2.5TB SSD


(P.S. You guys are great, and I'm really enjoying seeing your lab's publications.)
