Add Burrows-Wheeler Transform algorithm #977
Codecov Report
✅ All modified and coverable lines are covered by tests.
Coverage Diff
| | master | #977 | +/- |
|---|---|---|---|
| Coverage | 95.73% | 95.75% | +0.02% |
| Files | 349 | 350 | +1 |
| Lines | 22774 | 22892 | +118 |
| Hits | 21802 | 21921 | +119 |
| Misses | 972 | 971 | -1 |
@siriak, this is ready to be merged. The Clippy error is a CI issue; it does not occur on other machines.
Pull request overview
This PR adds an implementation of the Burrows-Wheeler Transform (BWT) algorithm, a block-sorting compression technique that rearranges strings into runs of similar characters to improve compressibility. The implementation includes both forward and reverse transforms with comprehensive test coverage.
Key changes:
- Implements the BWT algorithm with forward transform (`bwt_transform`) and reverse transform (`reverse_bwt`)
- Provides a utility function (`all_rotations`) to generate all rotations of a string
- Includes comprehensive tests for various input strings and edge cases
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| src/compression/burrows_wheeler_transform.rs | New file implementing the BWT algorithm with forward/reverse transforms, a BwtResult struct, and helper functions with comprehensive tests |
| src/compression/mod.rs | Adds module declaration and exports for the new BWT implementation |
| DIRECTORY.md | Adds documentation link for the new Burrows-Wheeler Transform algorithm in alphabetical order |
siriak left a comment
Looks good, thanks!
Description
Adds an implementation of the Burrows-Wheeler Transform (BWT), a block-sorting compression algorithm that rearranges strings into runs of similar characters.
The BWT is particularly useful for compression as it:
Example
For the string `"^BANANA"`:
- Transform: `"BNN^AAA"` with index `6`
- Reverse: `reverse_bwt("BNN^AAA", 6)` → `"^BANANA"`

Algorithm Overview
Forward Transform (`bwt_transform`):

Reverse Transform (`reverse_bwt`):
- `n` iterations (where `n` is the string length), extract the original string at the recorded index

Time Complexity:
Space Complexity: O(n²)
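As an illustration of the overview above, here is a minimal sketch of a rotation-based forward transform and a table-rebuilding reverse transform. It is not the code from this PR: the names `BwtResult`, `all_rotations`, `bwt_transform`, and `reverse_bwt` are taken from the description, but the signatures and the field names `bwt_string` and `idx_original_string` are assumptions made for this example. It materializes all `n` rotations, which is consistent with the O(n²) space noted above.

```rust
/// Result of the forward transform.
/// NOTE: the field names are assumptions for this sketch, not the PR's actual definition.
#[derive(Debug, PartialEq)]
pub struct BwtResult {
    pub bwt_string: String,
    pub idx_original_string: usize,
}

/// Generates every rotation of `s`; rotation `i` starts at character `i`.
pub fn all_rotations(s: &str) -> Vec<String> {
    let chars: Vec<char> = s.chars().collect();
    (0..chars.len())
        .map(|i| chars[i..].iter().chain(chars[..i].iter()).collect())
        .collect()
}

/// Forward transform: sort all rotations, take the last character of each,
/// and record the row at which the original string ends up.
pub fn bwt_transform(s: &str) -> BwtResult {
    let mut rotations = all_rotations(s);
    rotations.sort();
    let bwt_string: String = rotations.iter().filter_map(|r| r.chars().last()).collect();
    // Falls back to 0 for the empty string, which has no rotations.
    let idx_original_string = rotations.iter().position(|r| r == s).unwrap_or(0);
    BwtResult {
        bwt_string,
        idx_original_string,
    }
}

/// Reverse transform: rebuild the sorted rotation table by prepending the
/// transformed column and re-sorting, n times, then read the recorded row.
pub fn reverse_bwt(bwt_string: &str, idx_original_string: usize) -> String {
    let chars: Vec<char> = bwt_string.chars().collect();
    let mut table: Vec<String> = vec![String::new(); chars.len()];
    for _ in 0..chars.len() {
        for (row, c) in table.iter_mut().zip(&chars) {
            row.insert(0, *c);
        }
        table.sort();
    }
    // An out-of-range index yields an empty string in this sketch.
    table.get(idx_original_string).cloned().unwrap_or_default()
}
```

With these definitions, `bwt_transform("^BANANA")` produces `"BNN^AAA"` with index `6`, and `reverse_bwt("BNN^AAA", 6)` returns `"^BANANA"`, matching the example in the description. Sorting uses Rust's default `String` ordering (byte-wise), so `^` sorts after the uppercase letters.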
Changes Made
- `src/compression/burrows_wheeler_transform.rs`
- `src/compression/mod.rs`
- `BwtResult` struct for returning transform results
- `all_rotations` - generates all rotations of a string
- `bwt_transform` - performs the BWT
- `reverse_bwt` - reverses the BWT

Testing
All tests pass:
`cargo test burrows_wheeler_transform`
Test coverage includes:
Real-World Usage
The BWT is used in:
References
Checklist