-
Notifications
You must be signed in to change notification settings - Fork 82
Unexpected results in subset-based alignment and alignment against reference #795
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
I implemented an alternative version that allows to switch the interpolation method: .applyRtAdjustment <- function(x, rtraw, rtadj,
method = c("stepfun", "approxfun")) {
method <- match.arg(method)
## re-order everything if rtraw is not sorted; issue #146
if (is.unsorted(rtraw)) {
idx <- order(rtraw)
rtraw <- rtraw[idx]
rtadj <- rtadj[idx]
}
if (method == "approxfun")
adjFun <- approxfun(rtraw, rtadj)
else
adjFun <- stepfun(rtraw[-1] - diff(rtraw) / 2, rtadj)
res <- adjFun(x)
## Fix margins.
idx_low <- which(x < rtraw[1])
if (length(idx_low)) {
first_adj <- idx_low[length(idx_low)] + 1
res[idx_low] <- x[idx_low] + res[first_adj] - x[first_adj]
}
idx_high <- which(x > rtraw[length(rtraw)])
if (length(idx_high)) {
last_adj <- idx_high[1] - 1
res[idx_high] <- x[idx_high] + res[last_adj] - x[last_adj]
}
if (is.null(dim(res)))
names(res) <- names(x)
res
} |
I could not reproduce the issue above using artificial data - but it seems to me the Simulating just a linear shift of values by a constant: a_raw <- seq(1, 100, by = 0.3)
a_adj <- a_raw + 0.2 Defining a vector that should be aligned based on the above alignment results. Note that the increment is the same ( x <- seq(1.4, 20, by = 0.3) Running the original method: res <- .applyRtAdjustment(x, a_raw, a_adj, method = "stepfun")
head(x)
[1] 1.4 1.7 2.0 2.3 2.6 2.9
head(res)
[1] 1.5 1.8 2.1 2.4 2.7 3.0 The difference between the raw and adjusted values is Using res <- .applyRtAdjustment(x, a_raw, a_adj, method = "approxfun")
head(res)
[1] 1.6 1.9 2.2 2.5 2.8 3.1 And also A second example adding also noise - but still using a constant shift by a_raw <- seq(0.1, 100, by = 0.03)
a_adj <- sort(a_raw + 0.2)
set.seed(123)
x <- seq(2.02, 90.02, by = 0.03)
x <- sort(x + rnorm(length(x), 0, 0.001)) Running res <- .applyRtAdjustment(x, a_raw, a_adj, method = "stepfun")
expect_equal(res, x + 0.2)
Error: `res` not equal to x + 0.2.
2934/2934 mismatches (average diff: 0.00079)
[1] 2.22 - 2.22 == 5.60e-04
[2] 2.25 - 2.25 == 2.30e-04
[3] 2.28 - 2.28 == -1.56e-03
[4] 2.31 - 2.31 == -7.05e-05
[5] 2.34 - 2.34 == -1.29e-04
[6] 2.37 - 2.37 == -1.72e-03
[7] 2.40 - 2.40 == -4.61e-04
[8] 2.43 - 2.43 == 1.27e-03
[9] 2.46 - 2.46 == 6.87e-04
... And plotting the data: plot(res, res - x, type = "l") So, seems we partially validate the observed scattering in the initial example. While when using res <- .applyRtAdjustment(x, a_raw, a_adj, method = "approxfun")
expect_equal(res, x + 0.2)
head(x)
[1] 2.019440 2.049770 2.081559 2.110071 2.140129 2.171715
head(res)
[1] 2.219440 2.249770 2.281559 2.310071 2.340129 2.371715 For these example it seems the |
- This fixes issue #795 by using `approxfun()` instead of `stepfun()` for the subset-based alignment (and only there!). For adjustment of chrom peaks' retention times we still use the `stepfun()` that will ensure the RT will coincide with the RT of an actual spectrum (not interpolated).
The fix is in the above commit. It only affects subset-based adjustment. For any other retention time mapping it still uses the original Before making a PR I will cross-check the results on various data sets. |
For some data sets we see a strange artifact when using subset-based alignment:
i.e., instead of the expected continuous, smooth line for each sample we see such random scattering for the retention time differences (rtadj - rtraw) for the samples that get aligned against the subset (smooth solid lines in the plot).
This seems to be specific to certain data sets as we don't see this e.g. on the data set used in the package vignette.
In subset-based alignment we use the internal
.applyRtAdjustment()
function that aligns a vector of values against reference values:My suspicion is the use of
stepfun()
here. I will check if we could not maybe useapproxfun()
with linear interpolation instead.The text was updated successfully, but these errors were encountered: