Is your feature request related to a problem? Please describe.
Dimensionality reduction for visualisation differs from most other tasks: it does not produce a specific result but is used for exploration at different stages of analysis. We are just as likely to visualise messy data with batch effects etc. as clean data with defined cell types, and a good visualisation should represent whatever variation is in the dataset. Currently we use predefined labels (i.e. cell types) for some metrics, but these may not be appropriate when there are other sources of variation in the data.
Describe the solution you'd like
Instead of using predefined labels, we could create labels for the metrics using a standard pipeline (HVGs, PCA, neighbours, Louvain) with a fairly high resolution as part of the data preparation step. These labels should then capture whatever variation is present in the dataset (batches, cell types/states, ...) in an unbiased way, and metrics can check whether that variation is preserved in the low-dimensional space.
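To make the last point concrete, here is a minimal sketch of one way a metric could check whether generated labels are preserved in an embedding. It is illustrative only: the function name `knn_label_agreement`, the choice of a k-nearest-neighbour agreement score, and the default `k` are assumptions for this example, not existing metrics in the project. It uses only the Python standard library and a brute-force neighbour search, so it is suitable for small examples rather than real datasets.

```python
import math


def knn_label_agreement(embedding, labels, k=3):
    """Hypothetical metric: mean fraction of each point's k nearest
    neighbours (Euclidean distance in the embedding) that share its label.

    embedding: sequence of coordinate tuples, one per cell
    labels:    sequence of cluster labels, one per cell (e.g. from Louvain)
    """
    n = len(embedding)
    if n != len(labels):
        raise ValueError("embedding and labels must have the same length")
    if not 1 <= k < n:
        raise ValueError("k must be between 1 and n - 1")

    total = 0.0
    for i in range(n):
        # Brute-force distances from point i to every other point.
        dists = sorted(
            (math.dist(embedding[i], embedding[j]), j)
            for j in range(n)
            if j != i
        )
        neighbours = [j for _, j in dists[:k]]
        # Fraction of the k neighbours carrying the same label as point i.
        total += sum(labels[j] == labels[i] for j in neighbours) / k
    return total / n


# Toy example: two well-separated groups in a 2D embedding.
toy_embedding = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
toy_labels = ["a", "a", "a", "b", "b", "b"]
score = knn_label_agreement(toy_embedding, toy_labels, k=2)
```

A score close to 1 would suggest that the clusters found in the high-dimensional data stay together in the low-dimensional space, while a lower score would suggest they have been mixed.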
Describe alternatives you've considered
Keeping the current use of provided labels (which are not ideal for this case) or limiting the metrics to those that don't require labels.