Optimization Opportunity: PSADBW and VPDPBUSD

A good amount of silicon goes to video compression and, more recently, to neural network inference, which gives us very fast compound instructions.
An example of the former is PSADBW (https://www.felixcloutier.com/x86/psadbw), which computes the sum of absolute differences (the L1 distance) between two vectors of unsigned bytes, producing one sum per group of 8 bytes.
An example of the latter, part of VNNI, is VPDPBUSD (https://www.felixcloutier.com/x86/vpdpbusd), which computes dot products of groups of 4 unsigned bytes with 4 signed bytes and adds each result to a 32-bit accumulator.
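
To make the semantics concrete, here is a minimal C sketch of what the two instructions compute. The function names (`l1_distance_scalar`, `l1_distance`, `dpbusd_lane`, `dot_u8s8`) are hypothetical and not taken from this project. The PSADBW path uses the SSE2 intrinsic `_mm_sad_epu8` when the compiler targets SSE2. VPDPBUSD is modelled in scalar code only, since the real instruction (exposed as `_mm512_dpbusd_epi32`) needs VNNI hardware that may not be present.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h> /* _mm_sad_epu8 is the intrinsic for PSADBW */
#endif

/* Scalar reference: L1 distance between two unsigned-byte vectors. */
static uint64_t l1_distance_scalar(const uint8_t *a, const uint8_t *b, size_t n)
{
    uint64_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += (a[i] > b[i]) ? (uint64_t)(a[i] - b[i]) : (uint64_t)(b[i] - a[i]);
    return sum;
}

/* Same result, 16 bytes per step via PSADBW where SSE2 is available. */
static uint64_t l1_distance(const uint8_t *a, const uint8_t *b, size_t n)
{
    uint64_t sum = 0;
    size_t i = 0;
#if defined(__SSE2__)
    __m128i acc = _mm_setzero_si128();
    for (; i + 16 <= n; i += 16) {
        __m128i va = _mm_loadu_si128((const __m128i *)(a + i));
        __m128i vb = _mm_loadu_si128((const __m128i *)(b + i));
        /* Each 64-bit lane receives the sum of |a - b| over its 8 bytes. */
        acc = _mm_add_epi64(acc, _mm_sad_epu8(va, vb));
    }
    uint64_t lanes[2];
    _mm_storeu_si128((__m128i *)lanes, acc);
    sum = lanes[0] + lanes[1];
#endif
    /* Remaining bytes (or everything, without SSE2) go through the scalar loop. */
    return sum + l1_distance_scalar(a + i, b + i, n - i);
}

/* Scalar model of one 32-bit lane of VPDPBUSD:
   acc + sum of four (unsigned byte * signed byte) products, wrapping on overflow. */
static int32_t dpbusd_lane(int32_t acc, const uint8_t u[4], const int8_t s[4])
{
    int32_t sum = 0;
    for (int k = 0; k < 4; k++)
        sum += (int32_t)u[k] * (int32_t)s[k];
    /* Add in unsigned arithmetic so wraparound is not undefined behaviour. */
    return (int32_t)((uint32_t)acc + (uint32_t)sum);
}

/* Dot product of n bytes (n a multiple of 4), chaining the lane model. */
static int32_t dot_u8s8(const uint8_t *u, const int8_t *s, size_t n)
{
    assert(n % 4 == 0);
    int32_t acc = 0;
    for (size_t i = 0; i < n; i += 4)
        acc = dpbusd_lane(acc, u + i, s + i);
    return acc;
}
```

Note the asymmetry in VPDPBUSD: one operand is read as unsigned bytes and the other as signed bytes, so the two inputs only behave identically when all values fit in 0..127.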

They're similar in that each reduces a pair of byte vectors to a distance or similarity score, and I believe this can be exploited in the `window` threshold-generalization and metrics like `jaccard`, `cosine`, and `mutual_information`.
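
As one illustration of how a byte dot product might feed `cosine`, here is a rough C sketch. It assumes the inputs are byte-valued count vectors with entries in 0..127, which is an assumption on my part and not something confirmed by the code base. The names `dot_bytes` and `cosine_at_least` are hypothetical. The scalar loop in `dot_bytes` stands in for the part a VPDPBUSD-based kernel would accelerate, and the comparison against a rational threshold `num/den` avoids both the square root and the division.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Plain dot product of two byte vectors. With entries in 0..127 the
   unsigned*signed semantics of VPDPBUSD would give the same value. */
static int64_t dot_bytes(const uint8_t *a, const uint8_t *b, size_t n)
{
    int64_t acc = 0;
    for (size_t i = 0; i < n; i++)
        acc += (int64_t)a[i] * (int64_t)b[i];
    return acc;
}

/* Returns 1 if cosine(a, b) >= num/den, else 0.
   Uses ab^2 * den^2 >= num^2 * aa * bb, which is valid because every
   term is non-negative. The size limits keep the products inside 64 bits. */
static int cosine_at_least(const uint8_t *a, const uint8_t *b, size_t n,
                           int64_t num, int64_t den)
{
    assert(n <= 1024 && den > 0 && den <= 100 && num >= 0 && num <= den);
    int64_t ab = dot_bytes(a, b, n);
    int64_t aa = dot_bytes(a, a, n);
    int64_t bb = dot_bytes(b, b, n);
    if (aa == 0 || bb == 0)
        return 0; /* cosine is undefined for a zero vector; treat as no match */
    return ab * ab * den * den >= num * num * aa * bb;
}
```

Three dot products per pair is the whole cost here, and `aa` and `bb` could be precomputed once per vector if the same vectors are compared many times.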

Optimization Opportunity: PSADBW and VPDPBUSD #11
