Remove generalized gramians #690

@ValerianRey

The reasons are:

  • This interface is not coherent with the interface of autojac. autojac.jac computes jacobians of shape [m, n_1, ..., n_l], even if the output is of shape [m_1, m_2], with m_1 * m_2 = m. So the coherent autogram behavior should be to compute gramians of shape [m, m] if the output is of shape [m_1, m_2].
  • Changing autojac to make it coherent with autogram is extremely tedious: it would return jacobians of shape [m_1, ..., m_k, n_1, ..., n_l], which means that it's hard to know which dimensions are for the objectives and which dimensions are for the parameters. If we were to give such matrices to aggregators, we would also need to give them k, l or both. If we instead one day give jacobians of shape [m, n_1, ..., n_l] to aggregators, it will work without any problem.
  • Even in autogram, this only works if the output is a single tensor. As we've discussed a while ago, if the output is e.g. two tensors of shape [m_1, m_2] and [m_3, m_4], finding the corresponding generalized gramians is a nightmare which we don't support and never will. Even if we managed to support that, making the corresponding generalized weightings would also be a nightmare, and explaining that to the users would basically be impossible.
  • Generalized gramians, GeneralizedWeighting and Flattening are extra concepts and classes that can confuse the user. If we also went down the path of making autojac coherent with autogram (i.e. by making it return jacobians of shape [m_1, ..., m_k, n_1, ..., n_l]), we would even need the concept of generalized jacobian, GeneralizedAggregators, and so on. It would become too much very quickly.
  • It's actually possible to do IWMTL without all this. The user knows the shape of their output (e.g. [m_1, ..., m_k]), so if they really want to, they can themselves reshape their [m, m] gramian (again, with m = m_1 * ... * m_k) into a tensor of shape [m_1, ..., m_k, m_k, ..., m_1], and use whatever strategy they want to aggregate that (other than flattening of course, which they could have done without even doing this reshaping). So it's not like we're more expressive by supporting this: we're just taking a responsibility away from the user, but in a very incomplete way and while adding a lot of bloat to the library.
  • In autogram's internals, we can still work with generalized gramians if it's easier, as long as we don't return them to the users.
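
To make the shapes above concrete, here is a minimal sketch in plain PyTorch (not torchjd's API). The helpers `flat_jacobian` and `to_generalized_gramian` are hypothetical names introduced only for illustration. The sketch also assumes that the trailing dimensions [m_k, ..., m_1] are meant to mirror the leading ones, so that entry [i_1, ..., i_k, j_k, ..., j_1] holds the inner product between the gradients of output elements (i_1, ..., i_k) and (j_1, ..., j_k).

```python
import math

import torch


def flat_jacobian(fn, params):
    """Jacobian of fn at params, with all output dimensions flattened into one.

    Hypothetical helper: returns a matrix of shape [m, n], where m is the
    number of output elements and n the number of parameters.
    """
    # torch.autograd.functional.jacobian returns shape [*output_shape, *params.shape].
    jac = torch.autograd.functional.jacobian(fn, params)
    output_ndim = jac.ndim - params.ndim
    m = math.prod(jac.shape[:output_ndim])
    return jac.reshape(m, -1)


def to_generalized_gramian(gramian, output_shape):
    """Reshape an [m, m] gramian into shape [m_1, ..., m_k, m_k, ..., m_1].

    Hypothetical helper, assuming the trailing dims mirror the leading ones.
    """
    k = len(output_shape)
    m = math.prod(output_shape)
    if tuple(gramian.shape) != (m, m):
        raise ValueError(f"Expected a gramian of shape {(m, m)}, got {tuple(gramian.shape)}")
    # First view the gramian as [m_1, ..., m_k, m_1, ..., m_k] ...
    square = gramian.reshape(*output_shape, *output_shape)
    # ... then reverse the order of the last k dimensions.
    perm = list(range(k)) + list(range(2 * k - 1, k - 1, -1))
    return square.permute(perm)


def model(p):
    # Toy function with an output of shape [m_1, m_2] = [2, 3] and n = 4 parameters.
    return torch.outer(p[:2], p[1:])


params = torch.tensor([1.0, 2.0, 3.0, 4.0])
output_shape = (2, 3)

jacobian = flat_jacobian(model, params)  # shape [6, 4]
gramian = jacobian @ jacobian.T  # shape [6, 6]
generalized = to_generalized_gramian(gramian, output_shape)  # shape [2, 3, 3, 2]
```

Since the user already knows `output_shape`, this conversion only needs the [m, m] gramian. Under the same assumption, applying the same permutation again followed by `reshape(m, m)` recovers the original matrix.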

So in my opinion, we should get rid of the concept of generalized gramian and PSDTensor (except maybe internally in autogram), of GeneralizedWeighting, and of Flattening, and make the necessary transition (update the IWMTL example, tell users how to make the change in the changelog, etc.).

Labels: breaking-change, cc: refactor, package: aggregation, package: autogram
