Description
The main model spec defines a list of allowed Model Type values that are designed to capture the type of model in more detail than the Algorithm Type.
Opening this issue to discuss the following:
- Is "Model Type" the appropriate name for this field?
- Should there be any changes to the listed values?
- Should we allow other, user-defined values or require one of the listed values?
- Should we be more explicit about relationships between this field and the Algorithm Type field?
originally posted by @duckontheweb
Comments
Is "Model Type" the appropriate name for this field?
It seems reasonable! "Target type" might be another appropriate name.
Two thoughts here:
Multi-task models do not fit well. For example, models trained on the xBD dataset (https://arxiv.org/pdf/1911.09296.pdf) might need to segment buildings and predict image-level damage labels. Here a model might be both segmentation and classification!
Unsupervised models that use some sort of contrastive loss also do not fit well. For example, if an unsupervised model uses a triplet loss what should its "Model type" be?
Good points Caleb.
For multi-task models, one solution is to allow "Model Type" to take a list of types.
For the unsupervised example, doesn't it fall under "Dimensionality Reduction"? We could also add "Embedding", but it's more or less the same thing.
Also note that "Model Type" has a set of pre-defined values, but for edge cases like this the modeler can use or suggest a different type.
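To make the "list of types, with pre-defined values as a recommendation" idea concrete, here is a rough sketch of how a validator could treat the field. Everything in it is an assumption for discussion: the `RECOMMENDED_TYPES` values are placeholders rather than the spec's actual list, and `normalize_model_type` is a hypothetical helper, not part of any existing tooling.

```python
# Placeholder values for illustration only; the spec's real list is not reproduced here.
RECOMMENDED_TYPES = {"classification", "segmentation", "dimensionality-reduction"}


def normalize_model_type(value):
    """Accept a single type or a list of types.

    Returns a tuple (types, unrecognized), where `unrecognized` holds the
    user-defined values that are not in RECOMMENDED_TYPES. Unrecognized
    values are reported, not rejected, so edge cases remain expressible.
    """
    # A bare string is treated as a one-element list.
    types = [value] if isinstance(value, str) else list(value)
    if not types:
        raise ValueError("at least one model type is required")
    for t in types:
        if not isinstance(t, str) or not t.strip():
            raise ValueError(f"invalid model type: {t!r}")
    unrecognized = [t for t in types if t not in RECOMMENDED_TYPES]
    return types, unrecognized


# A multi-task model (e.g. segmentation plus image-level classification).
multi_task = normalize_model_type(["segmentation", "classification"])

# A user-defined value passes validation but is flagged as unrecognized.
custom = normalize_model_type("triplet-embedding")
```

Reporting unrecognized values instead of raising on them is just one option; a stricter spec could turn that list into an error.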
For the unsupervised example, doesn't it fall under "Dimensionality Reduction"?
https://github.com/radiantearth/geo-ml-model-catalog/pull/21 changed the model_type field to a prediction_type field in the model_type object. We don't currently have "Dimensionality Reduction" listed as an option in the "Prediction Type" section, but maybe we should. @calebrob6 if we add this, it seems like your suggestion of using target_type might be more appropriate than using prediction_type, since dimensionality reduction doesn't really represent a "prediction."
In computer vision, and in the ML field generally, the term "task" is more common than model_type. Regarding the listed values, we also sometimes see the terms "instance detection" / "instance segmentation":