
Handling discontinuous density functions #925

Open
@richardreeve

I'm thinking about problems with non-continuous distributions, and whether it really makes sense to only have Union{Discrete, Continuous} <: ValueSupport. There was a brief discussion about mixed discrete and continuous distributions in #332, and more recently #887 and #916 tried to deal with other problems with discrete distributions by allowing non-integer support. However, it seems like none of them is going anywhere at the moment...

The problem I see is that pdf is not only the probability density function for continuous distributions, it is also the probability mass function for discrete distributions. I'd like to be able to define a slab-and-spike distribution (as mentioned in that first issue), but I can't see how to do it. That's not just because the distributions in MixtureModels must all have the same ValueSupport subtype, but because it's not clear what the ValueSupport should be, nor what the pdf function should return: if we treat it as a density function, it's infinite at the point masses, whereas if we treat it as a mass function, it's zero everywhere else.

My feeling is that there's a problem with reusing pdf in both cases... a separate pmf seems like a better bet. This can obviously be introduced in a non-breaking way by aliasing pmf to pdf for the existing discrete distributions, while the mixed distributions can define both. What do people think? What do these distributions actually have to provide?
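To make the pmf/pdf split concrete, here is a minimal sketch of a slab-and-spike type that defines both functions. Everything in it is hypothetical: `SpikeAndSlab`, its fields, and the standalone `pmf` and `pdf` functions are defined locally for illustration and are not part of the package, and the slab is assumed to be normal.

```julia
# Hypothetical sketch: none of these names come from the package itself.
# A point mass of weight w at `spike`, mixed with a Normal(mu, sigma) slab.
struct SpikeAndSlab
    w::Float64      # probability placed on the point mass
    spike::Float64  # location of the point mass
    mu::Float64     # mean of the normal slab
    sigma::Float64  # standard deviation of the normal slab
end

# Mass function: nonzero only at the atom.
pmf(d::SpikeAndSlab, x::Real) = x == d.spike ? d.w : 0.0

# Density of the continuous part only, scaled by its weight (1 - w).
# It stays finite at the spike because the atom is reported by pmf instead.
function pdf(d::SpikeAndSlab, x::Real)
    z = (x - d.mu) / d.sigma
    return (1 - d.w) * exp(-z^2 / 2) / (d.sigma * sqrt(2pi))
end

d = SpikeAndSlab(0.3, 0.0, 0.0, 1.0)
mass_at_spike = pmf(d, 0.0)     # weight of the atom
mass_elsewhere = pmf(d, 1.0)    # no atom here
density_at_spike = pdf(d, 0.0)  # finite slab density at the same point
```

With this split neither function has to return an infinite value, and the total probability is the sum of the atom's mass and the integral of the density. Whether the continuous part should be reported already scaled by (1 - w), as assumed here, is one of the design questions that would need settling.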

The end goal of this from my perspective is to be able to construct zero-inflated (also mentioned in #390), hurdle and slab-and-spike distributions easily from their constituent parts (a point mass and another discrete or continuous distribution), as they all get used a reasonable amount in the real world... (in fact, I don't think there is even a point mass distribution at the moment?) But it's also to clean up the handling of discontinuous-but-not-discrete and discrete distributions more generally.

On the latter note, shouldn't we allow distributions with Discrete ValueSupport to have elements of any type, so long as the support has at most countably many elements (rather than requiring Ints, or things that round or floor to Ints, which is even more bizarre, as now)? Perhaps we should have CountableValue{T} <: DiscontinuousValue{T} <: ValueSupport, where T is currently Int, aliasing Discrete to CountableValue{Int}. #916 then suggests extending T in CountableValue{T} to Any. T in DiscontinuousValue{T} could be Float64 for a slab-and-spike distribution, and you could even have ContinuousValue{T} <: ValueSupport (with Continuous aliased to ContinuousValue{Float64}) for distributions whose support is complex-valued, Float32, etc. And then finally, what about more complicated support (say for a tree?) being explicitly handled by either a subtype of DiscontinuousValue{T} or a separate CompoundValue{T}?
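As a rough sketch of what that hierarchy could look like, assuming everything is declared as an abstract type used only as a type parameter: the definitions below are standalone and hypothetical, they do not reflect how the package currently declares `ValueSupport`, and `valuetype` is a helper name invented here.

```julia
# Hypothetical sketch of the proposed hierarchy; not the package's current definitions.
abstract type ValueSupport end
abstract type DiscontinuousValue{T} <: ValueSupport end
abstract type CountableValue{T} <: DiscontinuousValue{T} end
abstract type ContinuousValue{T} <: ValueSupport end

# Aliases that would keep the existing names working.
const Discrete = CountableValue{Int}
const Continuous = ContinuousValue{Float64}

# Invented helper: recover the element type T from a support type.
valuetype(::Type{<:DiscontinuousValue{T}}) where {T} = T
valuetype(::Type{<:ContinuousValue{T}}) where {T} = T

discrete_is_discontinuous = Discrete <: DiscontinuousValue{Int}
slab_spike_eltype = valuetype(DiscontinuousValue{Float64})
discrete_eltype = valuetype(Discrete)
```

Because Julia's parametric types are invariant, `CountableValue{Int}` is a subtype of `DiscontinuousValue{Int}` but not of `DiscontinuousValue{Float64}`, so code that wants to accept any discontinuous support would dispatch on `DiscontinuousValue` without fixing T.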
