@AliAlimohammadi
Contributor

Description

Adds an implementation of the Burrows-Wheeler Transform (BWT), a reversible block-sorting transform used as a preprocessing step for compression. It permutes the characters of a string so that identical characters tend to form runs.

The BWT is particularly useful for compression as it:

  • Groups identical characters together (making run-length encoding more effective)
  • Is fully reversible given only the transformed string and the index of the original string among the sorted rotations
  • Serves as preprocessing for compression algorithms like bzip2
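To illustrate why grouped characters help, here is a minimal run-length encoding sketch. The helper `run_length_encode` is hypothetical and not part of this PR; it only counts consecutive repeats.

```rust
/// Run-length encodes a string into (character, run length) pairs.
/// Hypothetical helper for illustration only; it is not part of this PR.
fn run_length_encode(s: &str) -> Vec<(char, usize)> {
    let mut runs: Vec<(char, usize)> = Vec::new();
    for c in s.chars() {
        // Extend the current run when the character repeats.
        if let Some(last) = runs.last_mut() {
            if last.0 == c {
                last.1 += 1;
                continue;
            }
        }
        // Otherwise start a new run of length 1.
        runs.push((c, 1));
    }
    runs
}
```

Applied to "^BANANA" this yields 7 runs of length 1, while the transformed string "BNN^AAA" collapses to 4 runs: (B, 1), (N, 2), (^, 1), (A, 3).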

Example

For the string "^BANANA":

  • BWT produces: "BNN^AAA" with index 6
  • Notice how the A's are grouped together, making it easier to compress
  • The transform is fully reversible: reverse_bwt("BNN^AAA", 6) → "^BANANA"

Algorithm Overview

Forward Transform (bwt_transform):

  1. Generate all rotations of the input string
  2. Sort the rotations lexicographically
  3. Take the last character of each sorted rotation to form the BWT string
  4. Record the position of the original string in the sorted list
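The four steps above can be sketched roughly as follows. This is a simplified illustration rather than the code in this PR: the name `bwt_sketch` and the tuple return type are assumptions (the PR returns a BwtResult struct), and the sketch assumes non-empty ASCII input so that byte slicing never splits a character.

```rust
/// Forward BWT sketch: returns the transformed string and the index of the
/// original string among the sorted rotations.
/// Assumption: input is non-empty ASCII (hypothetical restriction for this sketch).
fn bwt_sketch(s: &str) -> (String, usize) {
    assert!(!s.is_empty() && s.is_ascii(), "sketch expects non-empty ASCII input");
    let n = s.len();

    // Step 1: generate all n rotations of the input.
    let mut rotations: Vec<String> = (0..n)
        .map(|i| format!("{}{}", &s[i..], &s[..i]))
        .collect();

    // Step 2: sort the rotations lexicographically (byte order).
    rotations.sort();

    // Step 3: the last character of each sorted rotation forms the BWT string.
    let transformed: String = rotations
        .iter()
        .map(|r| r.as_bytes()[n - 1] as char)
        .collect();

    // Step 4: record where the original string landed in the sorted list.
    let index = rotations
        .iter()
        .position(|r| r.as_str() == s)
        .expect("the original string is always one of its own rotations");

    (transformed, index)
}
```

For "^BANANA" this returns ("BNN^AAA", 6), because '^' sorts after the uppercase letters in ASCII, which places the original string last.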

Reverse Transform (reverse_bwt):

  1. Start with a table of n empty strings
  2. In each iteration, prepend the i-th BWT character to the i-th row, then sort the rows
  3. After n iterations (where n is the string length), extract the original string at the recorded index
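A minimal sketch of this table-based inversion is shown below. The name `reverse_bwt_sketch` and the panicking precondition checks are assumptions made for illustration; the PR's reverse_bwt may handle invalid input differently.

```rust
/// Reverse BWT sketch using the naive "prepend and sort" table.
/// Assumption: `bwt` is ASCII and `index` is a valid row index (hypothetical checks).
fn reverse_bwt_sketch(bwt: &str, index: usize) -> String {
    assert!(bwt.is_ascii(), "sketch expects ASCII input");
    assert!(index < bwt.len(), "index must be smaller than the string length");

    let chars: Vec<char> = bwt.chars().collect();
    let n = chars.len();

    // Step 1: a table of n empty rows.
    let mut table: Vec<String> = vec![String::new(); n];

    for _ in 0..n {
        // Step 2a: prepend the i-th BWT character to the i-th row.
        for (row, &c) in table.iter_mut().zip(chars.iter()) {
            row.insert(0, c);
        }
        // Step 2b: sort the rows lexicographically.
        table.sort();
    }

    // Step 3: after n iterations the table holds all sorted rotations;
    // the original string sits at the recorded index.
    table[index].clone()
}
```

With the example from this description, reverse_bwt_sketch("BNN^AAA", 6) reconstructs "^BANANA".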

Time Complexity:

  • Forward: O(n² log n) where n is string length
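  • Reverse: O(n³ log n) in the worst case for the table-based approach above (n iterations, each sorting n strings of length up to n)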

Space Complexity: O(n²)

Changes Made

  • Added src/compression/burrows_wheeler_transform.rs
  • Added module declaration and export in src/compression/mod.rs
  • Includes BwtResult struct for returning transform results
  • Three public functions:
    • all_rotations - generates all rotations of a string
    • bwt_transform - performs the BWT
    • reverse_bwt - reverses the BWT

Testing

All tests pass:

cargo test burrows_wheeler_transform

Test coverage includes:

  • ✅ Multiple example strings (BANANA, panamabanana, etc.)
  • ✅ Roundtrip tests (transform → reverse → original)
  • ✅ Edge cases (single character, repeated characters)
  • ✅ Error handling (empty strings, invalid indices)
  • ✅ All examples from the Python implementation

Real-World Usage

The BWT is used in:

  • The bzip2 compressor
  • Bioinformatics, for DNA sequence alignment (FM-index)
  • Data compression in general, as a preprocessing step

Checklist

  • Code follows repository style guidelines
  • All tests pass
  • Documentation includes examples
  • Functions are publicly exported
  • No compiler warnings
  • Roundtrip property verified in tests

@codecov-commenter

codecov-commenter commented Dec 24, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 95.75%. Comparing base (b6a0787) to head (872f574).

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #977      +/-   ##
==========================================
+ Coverage   95.73%   95.75%   +0.02%     
==========================================
  Files         349      350       +1     
  Lines       22774    22892     +118     
==========================================
+ Hits        21802    21921     +119     
+ Misses        972      971       -1     

@AliAlimohammadi
Contributor Author

@siriak, this is ready to be merged. The Clippy error is a CI issue; there is no problem on other machines.

Copilot AI left a comment

Pull request overview

This PR adds an implementation of the Burrows-Wheeler Transform (BWT) algorithm, a block-sorting compression technique that rearranges strings into runs of similar characters to improve compressibility. The implementation includes both forward and reverse transforms with comprehensive test coverage.

Key changes:

  • Implements the BWT algorithm with forward transform (bwt_transform) and reverse transform (reverse_bwt)
  • Provides a utility function (all_rotations) to generate all rotations of a string
  • Includes comprehensive tests for various input strings and edge cases

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.

File | Description
src/compression/burrows_wheeler_transform.rs | New file implementing the BWT algorithm with forward/reverse transforms, a BwtResult struct, and helper functions with comprehensive tests
src/compression/mod.rs | Adds module declaration and exports for the new BWT implementation
DIRECTORY.md | Adds documentation link for the new Burrows-Wheeler Transform algorithm in alphabetical order

Member

@siriak siriak left a comment

Looks good, thanks!

@siriak siriak merged commit b77770e into TheAlgorithms:master Dec 25, 2025
13 checks passed