-
-
Notifications
You must be signed in to change notification settings - Fork 1k
feat: add stats/strided/distances/dcosine-similarity
#9281
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Changes from all commits
27e3f0b
4d1a80e
74ce3de
39c1208
9a3e999
7fbf16c
b9d9cc6
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,338 @@ | ||
| <!-- | ||
|
|
||
| @license Apache-2.0 | ||
|
|
||
| Copyright (c) 2025 The Stdlib Authors. | ||
|
|
||
| Licensed under the Apache License, Version 2.0 (the "License"); | ||
| you may not use this file except in compliance with the License. | ||
| You may obtain a copy of the License at | ||
|
|
||
| http://www.apache.org/licenses/LICENSE-2.0 | ||
|
|
||
| Unless required by applicable law or agreed to in writing, software | ||
| distributed under the License is distributed on an "AS IS" BASIS, | ||
| WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| See the License for the specific language governing permissions and | ||
| limitations under the License. | ||
|
|
||
| --> | ||
|
|
||
| # dcosineSimilarity | ||
|
|
||
| > Compute the cosine similarity of two double-precision floating-point strided arrays. | ||
|
|
||
| <section class="intro"> | ||
|
|
||
| The [cosine similarity][wikipedia-cosine-similarity] is defined as | ||
|
|
||
| <!-- <equation class="equation" label="eq:cosine_similarity" align="center" raw="sim=\frac{A\cdot B}{\|A\|\|B\|} alt="Equation for cosine similarity."> --> | ||
|
|
||
| ```math | ||
| sim = \frac{A \cdot B}{\|A\| \, \|B\|} | ||
| ``` | ||
|
|
||
| <!-- <div class="equation" align="center" data-raw-text="sim=\frac{A\cdot B}{\|A\|\|B\|} data-equation="eq:cosine_similarity"> <img src="" alt="Cosine similarity equation."> <br> | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This equation markup is not correct, but it looks like I will need to update.
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Could you please point me to the correct markup? I did not really get you. :) |
||
|
|
||
| </div> --> | ||
|
|
||
| <!-- </equation> --> | ||
|
|
||
| </section> | ||
|
|
||
| <!-- /.intro --> | ||
|
|
||
| <section class="usage"> | ||
|
|
||
| ## Usage | ||
|
|
||
| ```javascript | ||
| var dcosineSimilarity = require( '@stdlib/stats/strided/distances/dcosine-similarity' ); | ||
| ``` | ||
|
|
||
| #### dcosineSimilarity( N, x, strideX, y, strideY ) | ||
|
|
||
| Computes the cosine similarity of two double-precision floating-point strided arrays. | ||
|
|
||
| ```javascript | ||
| var Float64Array = require( '@stdlib/array/float64' ); | ||
|
|
||
| var x = new Float64Array( [ 4.0, 2.0, -3.0, 5.0, -1.0 ] ); | ||
| var y = new Float64Array( [ 2.0, 6.0, -1.0, -4.0, 8.0 ] ); | ||
|
|
||
| var z = dcosineSimilarity( x.length, x, 1, y, 1 ); | ||
| // returns ~-0.061 | ||
| ``` | ||
|
|
||
| The function has the following parameters: | ||
|
|
||
| - **N**: number of indexed elements. | ||
| - **x**: input [`Float64Array`][@stdlib/array/float64]. | ||
| - **strideX**: stride length of `x`. | ||
| - **y**: input [`Float64Array`][@stdlib/array/float64]. | ||
| - **strideY**: stride length of `y`. | ||
|
|
||
| The `N` and stride parameters determine which elements in the strided arrays are accessed at runtime. For example, to calculate the cosine similarity of every other value in `x` and the first `N` elements of `y` in reverse order, | ||
|
|
||
| ```javascript | ||
| var Float64Array = require( '@stdlib/array/float64' ); | ||
|
|
||
| var x = new Float64Array( [ 1.0, 2.0, 3.0, 4.0, 5.0, 6.0 ] ); | ||
| var y = new Float64Array( [ 1.0, 1.0, 1.0, 1.0, 1.0, 1.0 ] ); | ||
|
|
||
| var z = dcosineSimilarity( 3, x, 2, y, -1 ); | ||
| // returns ~0.878 | ||
| ``` | ||
|
|
||
| Note that indexing is relative to the first index. To introduce an offset, use [`typed array`][mdn-typed-array] views. | ||
|
|
||
| <!-- eslint-disable stdlib/capitalized-comments --> | ||
|
|
||
| ```javascript | ||
| var Float64Array = require( '@stdlib/array/float64' ); | ||
|
|
||
| // Initial arrays... | ||
| var x0 = new Float64Array( [ 1.0, 2.0, 3.0, 4.0, 5.0, 6.0 ] ); | ||
| var y0 = new Float64Array( [ 7.0, 8.0, 9.0, 10.0, 11.0, 12.0 ] ); | ||
|
|
||
| // Create offset views... | ||
| var x1 = new Float64Array( x0.buffer, x0.BYTES_PER_ELEMENT*1 ); // start at 2nd element | ||
| var y1 = new Float64Array( y0.buffer, y0.BYTES_PER_ELEMENT*3 ); // start at 4th element | ||
|
|
||
| var z = dcosineSimilarity( 3, x1, 1, y1, 1 ); | ||
| // returns ~0.982 | ||
| ``` | ||
|
|
||
| #### dcosineSimilarity.ndarray( N, x, strideX, offsetX, y, strideY, offsetY ) | ||
|
|
||
| Computes the cosine similarity of two double-precision floating-point strided arrays using alternative indexing semantics. | ||
|
|
||
| ```javascript | ||
| var Float64Array = require( '@stdlib/array/float64' ); | ||
|
|
||
| var x = new Float64Array( [ 4.0, 2.0, -3.0, 5.0, -1.0 ] ); | ||
| var y = new Float64Array( [ 2.0, 6.0, -1.0, -4.0, 8.0 ] ); | ||
|
|
||
| var z = dcosineSimilarity.ndarray( x.length, x, 1, 0, y, 1, 0 ); | ||
| // returns ~-0.061 | ||
| ``` | ||
|
|
||
| The function has the following additional parameters: | ||
|
|
||
| - **offsetX**: starting index for `x`. | ||
| - **offsetY**: starting index for `y`. | ||
|
|
||
| While [`typed array`][mdn-typed-array] views mandate a view offset based on the underlying buffer, the offset parameters support indexing semantics based on starting indices. For example, to calculate the cosine similarity of every other value in `x` starting from the second value with the last 3 elements in `y` in reverse order | ||
|
|
||
| ```javascript | ||
| var Float64Array = require( '@stdlib/array/float64' ); | ||
|
|
||
| var x = new Float64Array( [ 1.0, 2.0, 3.0, 4.0, 5.0, 6.0 ] ); | ||
| var y = new Float64Array( [ 7.0, 8.0, 9.0, 10.0, 11.0, 12.0 ] ); | ||
|
|
||
| var z = dcosineSimilarity.ndarray( 3, x, 2, 1, y, -1, y.length-1 ); | ||
| // returns ~0.895 | ||
| ``` | ||
|
|
||
| </section> | ||
|
|
||
| <!-- /.usage --> | ||
|
|
||
| <section class="notes"> | ||
|
|
||
| ## Notes | ||
|
|
||
| - If `N <= 0`, both functions return `0.0`. | ||
|
|
||
| </section> | ||
|
|
||
| <!-- /.notes --> | ||
|
|
||
| <section class="examples"> | ||
|
|
||
| ## Examples | ||
|
|
||
| <!-- eslint no-undef: "error" --> | ||
|
|
||
| ```javascript | ||
| var discreteUniform = require( '@stdlib/random/array/discrete-uniform' ); | ||
| var dcosineSimilarity = require( '@stdlib/stats/strided/distances/dcosine-similarity' ); | ||
|
|
||
| var opts = { | ||
| 'dtype': 'float64' | ||
| }; | ||
| var x = discreteUniform( 10, 0, 100, opts ); | ||
| console.log( x ); | ||
|
|
||
| var y = discreteUniform( x.length, 0, 10, opts ); | ||
| console.log( y ); | ||
|
|
||
| var out = dcosineSimilarity.ndarray( x.length, x, 1, 0, y, -1, y.length-1 ); | ||
| console.log( out ); | ||
| ``` | ||
|
|
||
| </section> | ||
|
|
||
| <!-- /.examples --> | ||
|
|
||
| <!-- C interface documentation. --> | ||
|
|
||
| * * * | ||
|
|
||
| <section class="c"> | ||
|
|
||
| ## C APIs | ||
|
|
||
| <!-- Section to include introductory text. Make sure to keep an empty line after the intro `section` element and another before the `/section` close. --> | ||
|
|
||
| <section class="intro"> | ||
|
|
||
| </section> | ||
|
|
||
| <!-- /.intro --> | ||
|
|
||
| <!-- C usage documentation. --> | ||
|
|
||
| <section class="usage"> | ||
|
|
||
| ### Usage | ||
|
|
||
| ```c | ||
| #include "stdlib/stats/strided/distances/dcosine_similarity.h" | ||
| ``` | ||
|
|
||
| #### stdlib_strided_dcosine_similarity( N, \*X, strideX, \*Y, strideY ) | ||
|
|
||
| Computes the cosine similarity of two double-precision floating-point strided arrays. | ||
|
|
||
| ```c | ||
| const double x[] = { 4.0, 2.0, -3.0, 5.0, -1.0 }; | ||
| const double y[] = { 2.0, 6.0, -1.0, -4.0, 8.0 }; | ||
|
|
||
| double v = stdlib_strided_dcosine_similarity( 5, x, 1, y, 1 ); | ||
| // returns ~-0.061 | ||
| ``` | ||
|
|
||
| The function accepts the following arguments: | ||
|
|
||
| - **N**: `[in] CBLAS_INT` number of indexed elements. | ||
| - **X**: `[in] double*` first input array. | ||
| - **strideX**: `[in] CBLAS_INT` stride length of `X`. | ||
| - **Y**: `[in] double*` second input array. | ||
| - **strideY**: `[in] CBLAS_INT` stride length of `Y`. | ||
|
|
||
| ```c | ||
| double stdlib_strided_dcosine_similarity( const CBLAS_INT N, const double *X, const CBLAS_INT strideX, const double *Y, const CBLAS_INT strideY ); | ||
| ``` | ||
|
|
||
| <!--lint disable maximum-heading-length--> | ||
|
|
||
| #### stdlib_strided_dcosine_similarity_ndarray( N, \*X, strideX, offsetX, \*Y, strideY, offsetY ) | ||
|
|
||
| <!--lint enable maximum-heading-length--> | ||
|
|
||
| Computes the cosine similarity of two double-precision floating-point strided arrays using alternative indexing semantics. | ||
|
|
||
| ```c | ||
| const double x[] = { 4.0, 2.0, -3.0, 5.0, -1.0 }; | ||
| const double y[] = { 2.0, 6.0, -1.0, -4.0, 8.0 }; | ||
|
|
||
| double v = stdlib_strided_dcosine_similarity_ndarray( 5, x, -1, 4, y, -1, 4 ); | ||
| // returns ~0.061 | ||
| ``` | ||
|
|
||
| The function accepts the following arguments: | ||
|
|
||
| - **N**: `[in] CBLAS_INT` number of indexed elements. | ||
| - **X**: `[in] double*` first input array. | ||
| - **strideX**: `[in] CBLAS_INT` stride length of `X`. | ||
| - **offsetX**: `[in] CBLAS_INT` starting index for `X`. | ||
| - **Y**: `[in] double*` second input array. | ||
| - **strideY**: `[in] CBLAS_INT` stride length of `Y`. | ||
| - **offsetY**: `[in] CBLAS_INT` starting index for `Y`. | ||
|
|
||
| ```c | ||
| double stdlib_strided_dcosine_similarity_ndarray( const CBLAS_INT N, const double *X, const CBLAS_INT strideX, const CBLAS_INT offsetX, const double *Y, const CBLAS_INT strideY, const CBLAS_INT offsetY ); | ||
| ``` | ||
|
|
||
| </section> | ||
|
|
||
| <!-- /.usage --> | ||
|
|
||
| <!-- C API usage notes. Make sure to keep an empty line after the `section` element and another before the `/section` close. --> | ||
|
|
||
| <section class="notes"> | ||
|
|
||
| </section> | ||
|
|
||
| <!-- /.notes --> | ||
|
|
||
| <!-- C API usage examples. --> | ||
|
|
||
| <section class="examples"> | ||
|
|
||
| ### Examples | ||
|
|
||
| ```c | ||
| #include "stdlib/stats/strided/distances/dcosine_similarity.h" | ||
| #include <stdio.h> | ||
|
|
||
| int main( void ) { | ||
| // Create strided arrays: | ||
| const double x[] = { 1.0, -2.0, 3.0, -4.0, 5.0, -6.0, 7.0, -8.0 }; | ||
| const double y[] = { 1.0, -2.0, 3.0, -4.0, 5.0, -6.0, 7.0, -8.0 }; | ||
|
|
||
| // Specify the number of elements: | ||
| const int N = 8; | ||
|
|
||
| // Specify strides: | ||
| const int strideX = 1; | ||
| const int strideY = -1; | ||
|
|
||
| // Compute the cosine similarity of `x` and `y`: | ||
| double sim = stdlib_strided_dcosine_similarity( N, x, strideX, y, strideY ); | ||
|
|
||
| // Print the result: | ||
| printf( "cosine similarity: %lf\n", sim ); | ||
|
|
||
| // Compute the cosine similarity of `x` and `y` with offsets: | ||
| sim = stdlib_strided_dcosine_similarity_ndarray( N, x, strideX, 0, y, strideY, N-1 ); | ||
|
|
||
| // Print the result: | ||
| printf( "cosine similarity: %lf\n", sim ); | ||
| } | ||
| ``` | ||
|
|
||
| </section> | ||
|
|
||
| <!-- /.examples --> | ||
|
|
||
| </section> | ||
|
|
||
| <!-- /.c --> | ||
|
|
||
| <!-- Section for related `stdlib` packages. Do not manually edit this section, as it is automatically populated. --> | ||
|
|
||
| <section class="related"> | ||
|
|
||
| </section> | ||
|
|
||
| <!-- /.related --> | ||
|
|
||
| <!-- Section for all links. Make sure to keep an empty line after the `section` element and another before the `/section` close. --> | ||
|
|
||
| <section class="links"> | ||
|
|
||
| [@stdlib/array/float64]: https://github.com/stdlib-js/stdlib/tree/develop/lib/node_modules/%40stdlib/array/float64 | ||
|
|
||
| [mdn-typed-array]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/TypedArray | ||
|
|
||
| [wikipedia-cosine-similarity]: https://en.wikipedia.org/wiki/Cosine_similarity | ||
|
|
||
| <!-- <related-links> --> | ||
|
|
||
| <!-- </related-links> --> | ||
|
|
||
| </section> | ||
|
|
||
| <!-- /.links --> | ||
Uh oh!
There was an error while loading. Please reload this page.