Use clustering in original space to create labels for metrics #8

@lazappi

Is your feature request related to a problem? Please describe.

Dimensionality reduction for visualisation differs from most other tasks: it does not produce a specific result but is used for exploration at different stages of analysis. We are just as likely to visualise messy data with batch effects etc. as we are clean data with defined cell types, and a good visualisation should represent whatever variation is in the dataset. Currently we use predefined labels (i.e. cell types) for some metrics, but these may not be appropriate when there are other sources of variation in the data.

Describe the solution you'd like

Instead of using predefined labels, we could create labels for metrics using a standard pipeline (HVGs, PCA, neighbours, Louvain) with a fairly high resolution as part of the data preparation step. These labels should then capture whatever variation is present in the dataset (batches, cell types/states, ...) in an unbiased way, and metrics can check whether that variation is preserved in the low-dimensional space.
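To make the "check whether it is preserved" part concrete, here is a rough sketch of one possible label-based check. It is not one of the existing metrics. It assumes the cluster labels have already been produced by the clustering step and are passed in as a plain list. The function name `knn_label_agreement` and the parameter `k` are made up for illustration.

```python
import math


def knn_label_agreement(embedding, labels, k=3):
    """Mean fraction of each point's k nearest neighbours in the
    low-dimensional embedding that share its cluster label.

    Hypothetical helper: `labels` is assumed to come from clustering
    in the original space (e.g. HVGs, PCA, neighbours, Louvain).
    """
    n = len(embedding)
    if n != len(labels):
        raise ValueError("embedding and labels must have the same length")
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of points")

    total = 0.0
    for i in range(n):
        # Euclidean distances from point i to every other point
        dists = sorted(
            (math.dist(embedding[i], embedding[j]), j)
            for j in range(n)
            if j != i
        )
        neighbours = [j for _, j in dists[:k]]
        # Fraction of those neighbours carrying the same label as point i
        total += sum(labels[j] == labels[i] for j in neighbours) / k
    return total / n


# Toy example: two well-separated groups in a 2D embedding
toy_embedding = [(0, 0), (0, 1), (1, 0), (1, 1),
                 (10, 10), (10, 11), (11, 10), (11, 11)]
toy_labels = ["a"] * 4 + ["b"] * 4
toy_score = knn_label_agreement(toy_embedding, toy_labels, k=3)
```

A score near 1 would suggest the embedding keeps clusters from the original space together; a low score would suggest they are mixed. The brute-force distance loop is only meant for illustration and would need a proper neighbour search for real datasets.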

Describe alternatives you've considered

Keeping the current use of provided labels (which are not ideal for this case) or limiting metrics to those that don't require labels.

Labels: enhancement (New feature or request)
