Add APIs for case folding to the standard library #154742

Open
Jules-Bertholet wants to merge 3 commits into rust-lang:main from Jules-Bertholet:casefold

@Jules-Bertholet
Contributor

Libs-api requested these, so here they are.

New public API (gated behind #[feature(casefold)]):

impl char {
    pub fn to_casefold(self) -> ToCasefold;
}

impl str {
    pub fn to_casefold(&self) -> String;
    pub fn eq_ignore_case(&self, other: &str) -> bool;
}

pub struct ToCasefold { ... }
impl Iterator for ToCasefold { type Item = char; ... }
impl DoubleEndedIterator for ToCasefold { ... }
impl FusedIterator for ToCasefold { }
impl ExactSizeIterator for ToCasefold { ... }
impl fmt::Display for ToCasefold { ... }

Notes

  • This adds only a negligible amount of static data to core::unicode. To accomplish that, we compute the case folding of most characters as the lowercase of their uppercase; this double mapping adds some complexity to the implementation.
  • No normalization (e.g. NFC) is performed, so visually and semantically equivalent strings can compare unequal.
  • I have not put any effort into optimizing eq_ignore_case(); there may be a more performant implementation.
  • char::eq_ignore_case() is left to future work. It's a potential footgun, so we may want to think more deeply about how to expose and document that API.
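
To illustrate the double mapping and the lack of normalization described in the notes, here is a rough sketch using only stable std APIs. It is not the implementation in this PR: `approx_fold_chars`, `approx_casefold`, and `approx_eq_ignore_case` are hypothetical names, and the sketch ignores the characters whose case folding differs from lowercase-of-uppercase, which the real implementation has to handle separately.

```rust
// Sketch only: approximates case folding as "lowercase of the uppercase".
// All function names here are hypothetical and are not part of this PR.

/// Lazily yields the approximate case folding of `s`, one char at a time.
fn approx_fold_chars(s: &str) -> impl Iterator<Item = char> + '_ {
    s.chars()
        .flat_map(char::to_uppercase) // e.g. 'ß' becomes 'S', 'S'
        .flat_map(char::to_lowercase) // e.g. 'S', 'S' becomes 's', 's'
}

/// Collects the approximate case folding into a new String.
fn approx_casefold(s: &str) -> String {
    approx_fold_chars(s).collect()
}

/// Compares two strings by their approximate case foldings without allocating.
fn approx_eq_ignore_case(a: &str, b: &str) -> bool {
    approx_fold_chars(a).eq(approx_fold_chars(b))
}

fn main() {
    // The double mapping catches cases that plain lowercasing misses.
    assert_eq!(approx_casefold("Straße"), "strasse");
    assert!(approx_eq_ignore_case("Straße", "STRASSE"));
    assert_ne!("Straße".to_lowercase(), "STRASSE".to_lowercase());

    // No normalization: precomposed and decomposed "é" still compare unequal.
    let precomposed = "\u{00E9}";
    let decomposed = "e\u{0301}";
    assert!(!approx_eq_ignore_case(precomposed, decomposed));
}
```

Callers who need canonically equivalent strings to compare equal would have to normalize both sides (for example to NFC) before folding; that step is outside the scope of this sketch.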

@rustbot label T-libs-api A-unicode

@rustbot
Collaborator

rustbot commented Apr 3, 2026

These commits modify the Cargo.lock file. Unintentional changes to Cargo.lock can be introduced when switching branches and rebasing PRs.

If this was unintentional then you should revert the changes before this PR is merged.
Otherwise, you can ignore this comment.

library/core/src/unicode/unicode_data.rs is generated by the src/tools/unicode-table-generator tool.

If you want to modify unicode_data.rs, please modify the tool then regenerate the library source file via ./x run src/tools/unicode-table-generator instead of editing unicode_data.rs manually.

@rustbot added the S-waiting-on-review and T-libs labels Apr 3, 2026
@rustbot
Collaborator

rustbot commented Apr 3, 2026

r? @scottmcm

rustbot has assigned @scottmcm.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

Why was this reviewer chosen?

The reviewer was selected based on:

  • Owners of files modified in this PR: @scottmcm, libs
  • @scottmcm, libs expanded to 8 candidates
  • Random selection from Mark-Simulacrum, jhpratt, scottmcm

@rustbot added the A-Unicode and T-libs-api labels Apr 3, 2026

@Jules-Bertholet force-pushed the casefold branch 2 times, most recently from 5b5e617 to bf4ee7c on April 3, 2026 13:25
@scottmcm
Member

@rustbot reroll

@rustbot assigned jhpratt and unassigned scottmcm Apr 16, 2026

@rustbot
Collaborator

rustbot commented Apr 16, 2026

This PR was rebased onto a different main commit. Here's a range-diff highlighting what actually changed.

Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers.


Labels

  • A-Unicode (Area: Unicode)
  • S-waiting-on-review (Status: Awaiting review from the assignee but also interested parties.)
  • T-libs (Relevant to the library team, which will review and decide on the PR/issue.)
  • T-libs-api (Relevant to the library API team, which will review and decide on the PR/issue.)


5 participants