Description
- Until v1.0.3, `pairtools sort` allows the header line to list the column names `chr1` and `chr2` (as indicated in the official 4DN specs).
- Starting with v1.1.0, `pairtools sort` expects the header line indicating column names to list `chrom1` and `chrom2`, and breaks if the header line is `#columns: readID chr1 pos1 chr2 pos2 strand1 strand2`.
- It also seems to require `pair_type` to be present in the `#columns` line of the header, as well as in a column.
I understand that the `chr1`/`chr2` problem can be circumvented by specifying the `--c1` and `--c2` options on the command line, but now, if a `pair_type` column is not included, `pairtools sort` cannot work. Is this intended behavior? Sorry if I missed something or if this issue has already been raised.
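As an alternative to passing column names on the command line, the `#columns:` line could be rewritten before sorting. The following is only a sketch: `RENAMES` and `normalize_columns_line` are hypothetical names, not part of pairtools, and the sketch does not address the missing `pair_type` column.

```python
# Hypothetical helper, not part of pairtools: rename legacy column names
# on the `#columns:` header line and leave every other line untouched.
RENAMES = {"chr1": "chrom1", "chr2": "chrom2"}

def normalize_columns_line(line):
    """Return `line` with chr1/chr2 renamed if it is a `#columns:` line."""
    if not line.startswith("#columns:"):
        return line
    # Preserve a trailing newline if the caller passed one.
    newline = "\n" if line.endswith("\n") else ""
    prefix, *names = line.split()
    renamed = [RENAMES.get(name, name) for name in names]
    return " ".join([prefix] + renamed) + newline

print(normalize_columns_line("#columns: readID chr1 pos1 chr2 pos2 strand1 strand2"))
# → #columns: readID chrom1 pos1 chrom2 pos2 strand1 strand2
```

Applied line by line to a `.pairs` file, this leaves data rows and other header lines as they are.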
Reproducible example
- Here is an unsorted pairs file I created by hand, with `chr1`/`chr2` in the header:
echo -e "## pairs format v1.0
#columns: readID chr1 pos1 chr2 pos2 strand1 strand2
#sorted: readID
#shape: upper triangle
#chromsize: NODE_522 22786
#chromsize: NODE_1404 15015
#chromsize: NODE_1814 13236
NS500150:497:HWH2WBGXC:4:23605:21900:3336\tNODE_1404\t461\tNODE_1404\t246\t --
NS500150:497:HWH2WBGXC:4:23603:4102:4882\tNODE_522\t6855\tNODE_1404\t1035\t--
NS500150:497:HWH2WBGXC:4:23606:10802:17906\tNODE_1404\t1441\tNODE_1814\t4433\t--" > tmp.pairs
This works:
pip install pairtools==1.0.3
pairtools sort tmp.pairs
## pairs format v1.0
#sorted: readID
#shape: upper triangle
#chromsize: NODE_522 22786
#chromsize: NODE_1404 15015
#chromsize: NODE_1814 13236
#columns: readID chr1 pos1 chr2 pos2 strand1 strand2
NS500150:497:HWH2WBGXC:4:23605:21900:3336 NODE_1404 461 NODE_1404 246 --
NS500150:497:HWH2WBGXC:4:23606:10802:17906 NODE_1404 1441 NODE_1814 4433 --
NS500150:497:HWH2WBGXC:4:23603:4102:4882 NODE_522 6855 NODE_1404 1035 --
This fails:
pip install pairtools==1.1.1 ## pairtools 1.1.0 errors with `circular import`
pairtools sort tmp.pairs
## pairs format v1.0
#sorted: readID
#shape: upper triangle
#chromsize: NODE_522 22786
#chromsize: NODE_1404 15015
#chromsize: NODE_1814 13236
#columns: readID chr1 pos1 chr2 pos2 strand1 strand2
Traceback (most recent call last):
File "/home/rsg/micromamba/envs/metator/bin/pairtools", line 8, in <module>
sys.exit(cli())
File "/home/rsg/micromamba/envs/metator/lib/python3.10/site-packages/click/core.py", line 1161, in __call__
return self.main(*args, **kwargs)
File "/home/rsg/micromamba/envs/metator/lib/python3.10/site-packages/click/core.py", line 1082, in main
rv = self.invoke(ctx)
File "/home/rsg/micromamba/envs/metator/lib/python3.10/site-packages/click/core.py", line 1697, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/home/rsg/micromamba/envs/metator/lib/python3.10/site-packages/click/core.py", line 1443, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/home/rsg/micromamba/envs/metator/lib/python3.10/site-packages/click/core.py", line 788, in invoke
return __callback(*args, **kwargs)
File "/home/rsg/micromamba/envs/metator/lib/python3.10/site-packages/pairtools/cli/__init__.py", line 183, in wrapper
return func(*args, **kwargs)
File "/home/rsg/micromamba/envs/metator/lib/python3.10/site-packages/pairtools/cli/sort.py", line 128, in sort
sort_py(
File "/home/rsg/micromamba/envs/metator/lib/python3.10/site-packages/pairtools/cli/sort.py", line 199, in sort_py
colindex = int(col) if col.isnumeric() else column_names.index(col) + 1
ValueError: 'chrom1' is not in list
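The failing lookup can be reproduced outside pairtools. The sketch below copies only the expression shown in the last traceback frame; `resolve_column` is a hypothetical wrapper name, and treating `chrom1` as the default column name is inferred from the error message rather than from the pairtools source.

```python
# Stand-in for the expression on the last traceback frame; not pairtools code.
def resolve_column(col, column_names):
    """Return the 1-based index of `col`, given as a number or a header name."""
    return int(col) if col.isnumeric() else column_names.index(col) + 1

header = "#columns: readID chr1 pos1 chr2 pos2 strand1 strand2"
column_names = header.split()[1:]  # drop the "#columns:" prefix

print(resolve_column("chr1", column_names))  # name present in the header → 2
print(resolve_column("2", column_names))     # numeric value is used as-is → 2

try:
    # Assumed default name; it is absent from this header, so list.index raises.
    resolve_column("chrom1", column_names)
except ValueError as err:
    print(f"ValueError: {err}")
```

Because the expression accepts either a number or a name, a name that does not appear in the `#columns:` line fails.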
- Now, changing `chr1`/`chr2` to `chrom1`/`chrom2` in the header:
echo -e "## pairs format v1.0
#columns: readID chrom1 pos1 chrom2 pos2 strand1 strand2
#sorted: readID
#shape: upper triangle
#chromsize: NODE_522 22786
#chromsize: NODE_1404 15015
#chromsize: NODE_1814 13236
NS500150:497:HWH2WBGXC:4:23605:21900:3336\tNODE_1404\t461\tNODE_1404\t246\t --
NS500150:497:HWH2WBGXC:4:23603:4102:4882\tNODE_522\t6855\tNODE_1404\t1035\t--
NS500150:497:HWH2WBGXC:4:23606:10802:17906\tNODE_1404\t1441\tNODE_1814\t4433\t--" > tmp2.pairs
This works:
pip install pairtools==1.0.3
pairtools sort tmp2.pairs
# sorted pairs...
This fails:
pip install pairtools==1.1.1
pairtools sort tmp2.pairs
## pairs format v1.0
#sorted: readID
#shape: upper triangle
#chromsize: NODE_522 22786
#chromsize: NODE_1404 15015
#chromsize: NODE_1814 13236
#columns: readID chr1 pos1 chr2 pos2 strand1 strand2
Traceback (most recent call last):
File "/home/rsg/micromamba/envs/metator/bin/pairtools", line 8, in <module>
sys.exit(cli())
File "/home/rsg/micromamba/envs/metator/lib/python3.10/site-packages/click/core.py", line 1161, in __call__
return self.main(*args, **kwargs)
File "/home/rsg/micromamba/envs/metator/lib/python3.10/site-packages/click/core.py", line 1082, in main
rv = self.invoke(ctx)
File "/home/rsg/micromamba/envs/metator/lib/python3.10/site-packages/click/core.py", line 1697, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/home/rsg/micromamba/envs/metator/lib/python3.10/site-packages/click/core.py", line 1443, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/home/rsg/micromamba/envs/metator/lib/python3.10/site-packages/click/core.py", line 788, in invoke
return __callback(*args, **kwargs)
File "/home/rsg/micromamba/envs/metator/lib/python3.10/site-packages/pairtools/cli/__init__.py", line 183, in wrapper
return func(*args, **kwargs)
File "/home/rsg/micromamba/envs/metator/lib/python3.10/site-packages/pairtools/cli/sort.py", line 128, in sort
sort_py(
File "/home/rsg/micromamba/envs/metator/lib/python3.10/site-packages/pairtools/cli/sort.py", line 199, in sort_py
colindex = int(col) if col.isnumeric() else column_names.index(col) + 1
ValueError: 'pair_type' is not in list
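A quick pre-flight check can list every expected name missing from the header in one pass, instead of hitting one `ValueError` at a time. This is a sketch under assumptions: `ASSUMED_SORT_COLUMNS` and `missing_columns` are hypothetical names, and the list of required columns is inferred from the two errors in this report plus the position columns, not taken from the pairtools source.

```python
# Names inferred from the errors in this report; the exact set that
# `pairtools sort` looks up is an assumption.
ASSUMED_SORT_COLUMNS = ["chrom1", "chrom2", "pos1", "pos2", "pair_type"]

def missing_columns(header_lines, required=ASSUMED_SORT_COLUMNS):
    """Return the required names absent from the `#columns:` header line."""
    for line in header_lines:
        if line.startswith("#columns:"):
            present = line.split()[1:]
            return [name for name in required if name not in present]
    # No `#columns:` line at all: every required name counts as missing.
    return list(required)

header = [
    "## pairs format v1.0",
    "#columns: readID chrom1 pos1 chrom2 pos2 strand1 strand2",
]
print(missing_columns(header))  # → ['pair_type']
```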
- Now, adding `pair_type`:
echo -e "## pairs format v1.0
#columns: readID chrom1 pos1 chrom2 pos2 strand1 strand2 pair_type
#sorted: readID
#shape: upper triangle
#chromsize: NODE_522 22786
#chromsize: NODE_1404 15015
#chromsize: NODE_1814 13236
NS500150:497:HWH2WBGXC:4:23605:21900:3336\tNODE_1404\t461\tNODE_1404\t246\t --
NS500150:497:HWH2WBGXC:4:23603:4102:4882\tNODE_522\t6855\tNODE_1404\t1035\t--
NS500150:497:HWH2WBGXC:4:23606:10802:17906\tNODE_1404\t1441\tNODE_1814\t4433\t--" > tmp3.pairs
This works:
pip install pairtools==1.0.3
pairtools sort tmp3.pairs
# sorted pairs...
This works:
pip install pairtools==1.1.1
pairtools sort tmp3.pairs
## pairs format v1.0
#sorted: readID
#shape: upper triangle
#chromsize: NODE_522 22786
#chromsize: NODE_1404 15015
#chromsize: NODE_1814 13236
#columns: readID chrom1 pos1 chrom2 pos2 strand1 strand2 pair_type
NS500150:497:HWH2WBGXC:4:23605:21900:3336 NODE_1404 461 NODE_1404 246 --
NS500150:497:HWH2WBGXC:4:23606:10802:17906 NODE_1404 1441 NODE_1814 4433 --
NS500150:497:HWH2WBGXC:4:23603:4102:4882 NODE_522 6855 NODE_1404 1035 --