Description
Problem:
The CNV_files .csv output from Uphyloplot2 lists several cell groups with more than one letter classification. I don't think this should happen, and I don't know how to interpret it. The .svg plots fine; it's only the CNV_files .csv that doesn't seem to match.
In the test data that comes with Uphyloplot2, each group gets a single letter classification.
However, in my actual data, I get multiple letters assigned to each cell group for every sample, like the following:
1,0.0
1.1,0.0,B
1.1.1,0.0,C
1.1.1.1,8.454810495626822,D
1.1.1.2,7.38581146744412,E,D
1.1.2,0.0,F,E
1.1.2.1,20.89407191448008,G,F
1.1.2.2,55.393586005830905,H,G
1.2,0.0,I,H
1.2.1,0.0,J,I
I'm using InferCNV (v1.18.1) with the Leiden model, so to get the cell groupings I call the plot_cnv() function with write_phylo = TRUE to produce a dendrogram.txt file. I then run uphyloplot2's newick_input.py on the dendrogram file to get the cell_groupings. After post-processing to subset the cells to just the tumor cells and split them by sample, I run uphyloplot2.py and get results like those shown above for any given sample.
But since the test data and other publications using this program only show a single letter classification for each group, I don't understand why I'm getting dual-letter classifications or how to interpret it. What's going on? What could cause this? Especially when the .svg plot seems fine?
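To list exactly which groups are affected, the rows shown above can be scanned for more than one letter field. This is only a minimal sketch: it assumes each row has the layout `group_id,percentage,letter[,letter...]`, which is inferred from the output pasted above and not from the Uphyloplot2 documentation, and the function name `find_multi_letter_rows` is hypothetical.

```python
import csv
import io


def find_multi_letter_rows(lines):
    """Return (group_id, letters) for every row that carries more than one letter.

    Assumed row layout (inferred from the sample output, not documented):
    group_id, percentage, then zero or more letter fields.
    """
    flagged = []
    for row in csv.reader(lines):
        if not row:
            continue  # skip blank lines
        group_id = row[0]
        # Everything after the percentage column is treated as a letter field.
        letters = [field.strip() for field in row[2:] if field.strip()]
        if len(letters) > 1:
            flagged.append((group_id, letters))
    return flagged


# The rows pasted in this issue, used as sample input.
sample = """1,0.0
1.1,0.0,B
1.1.1,0.0,C
1.1.1.1,8.454810495626822,D
1.1.1.2,7.38581146744412,E,D
1.1.2,0.0,F,E
1.1.2.1,20.89407191448008,G,F
1.1.2.2,55.393586005830905,H,G
1.2,0.0,I,H
1.2.1,0.0,J,I
"""

flagged = find_multi_letter_rows(io.StringIO(sample))
for group_id, letters in flagged:
    print(group_id, letters)
```

On the sample above, every row from 1.1.1.2 onward is flagged, and in each of them the second letter repeats the first letter of the previous row.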
Other things I've tried:
- Reading the InferCNV dendrogram file into the phylogram library and using phylogram to write the dendrogram file back out. (I thought there might be some kind of dendrogram formatting issue from InferCNV, but even when phylogram writes the dendrogram file, I get the same results as above.)
- I noticed that my barcodes, which end in a dash and a number like 'ATCGATCGATCG-1', were having that suffix trimmed off by uphyloplot2. But trimming the barcodes myself (and confirming all were still unique after trimming) and re-running gave the same results as above. So, while there was a barcode issue, it's not causing the problem.
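For reference, the barcode uniqueness check mentioned in the last item can be sketched as below. It assumes the suffix is a dash followed by digits (as in 'ATCGATCGATCG-1'); how uphyloplot2 itself trims barcodes is not something this sketch claims to reproduce, and the helper names and example barcodes are hypothetical.

```python
import re
from collections import Counter


def trim_barcode(barcode):
    # Assumption: the suffix to drop is a dash followed by digits, e.g. "-1".
    return re.sub(r"-\d+$", "", barcode)


def duplicated_after_trim(barcodes):
    """Return the trimmed barcodes that occur more than once after trimming."""
    counts = Counter(trim_barcode(b) for b in barcodes)
    return sorted(b for b, n in counts.items() if n > 1)


# Made-up barcodes for illustration; the first two collide once the suffix is removed.
barcodes = ["ATCGATCGATCG-1", "ATCGATCGATCG-2", "GGGGATCGATCG-1"]
dups = duplicated_after_trim(barcodes)
print(dups)
```

An empty result means the trimmed barcodes are still unique, which is the situation described above.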