PERF/API: arithmetic result.index.dtype with mismatched-dtype equal-values indexes #63417

@jbrockmendel

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd

dti = pd.date_range("2016-01-01", periods=3)
ser = pd.Series(1, index=dti)
ser2 = pd.Series(1, index=dti.as_unit("ns"))
ser3 = pd.Series(1, index=ser2.index.as_unit("us"))

ser == ser2  # <-- raises because these are not "identically-labelled"

In [6]: %timeit ser + ser2
73.9 μs ± 3.12 μs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

In [8]: %timeit ser + ser3
17 μs ± 474 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

Issue Description

If we have two Series whose indexes have mismatched dt64 units but equal values, the self.index.equals(other.index) check in Series.align returns False. This leads to an expensive cast. This could be avoided by patching DatetimeArray.equals to check for equal values*. xref #33940.
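The mismatch described above can be seen directly on the indexes. This is a minimal sketch, assuming a pandas version that provides DatetimeIndex.as_unit (2.0 or later). The variable names are illustrative, and the value of strict_equal reflects the behavior reported in this issue, so it may differ if DatetimeArray.equals is changed later.

```python
import pandas as pd

# Build two indexes with the same timestamps but different units.
dti_ns = pd.date_range("2016-01-01", periods=3).as_unit("ns")
dti_us = dti_ns.as_unit("us")

# Elementwise comparison works across units and looks at values only.
values_equal = bool((dti_ns == dti_us).all())

# Index.equals also takes the dtype into account; per this report it
# returns False here, which sends Series.align down the slower
# join-and-cast path.
strict_equal = dti_ns.equals(dti_us)

print(dti_ns.dtype, dti_us.dtype)
print("values equal:", values_equal)
print("Index.equals:", strict_equal)
```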

This would be a win perf-wise, but the downside is that result.index.dtype would no longer be commutative (it would depend on operand order), and I think in previous discussions we've considered commutativity desirable.
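To make the commutativity concern concrete, the following sketch checks the result's index dtype in both operand orders. It assumes pandas 2.0 or later; the names ser_us, ser_ns, forward and reverse are illustrative only. On versions where alignment casts both indexes to a common dtype, the two printed dtypes should match; with an equal-values shortcut in equals, the result would presumably keep the index of one operand, so the two orders could differ.

```python
import pandas as pd

# Same timestamps, different units.
dti_us = pd.date_range("2016-01-01", periods=3).as_unit("us")
dti_ns = dti_us.as_unit("ns")
ser_us = pd.Series(1, index=dti_us)
ser_ns = pd.Series(1, index=dti_ns)

# Add in both operand orders and compare the resulting index dtypes.
forward = ser_us + ser_ns
reverse = ser_ns + ser_us

dtype_is_commutative = forward.index.dtype == reverse.index.dtype
print(forward.index.dtype, reverse.index.dtype)
print("index dtype independent of operand order:", dtype_is_commutative)
```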

* We could actually improve perf even more in cases with a non-None .freq by short-circuiting when the freqs match and self[0] == other[0].
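The footnote's short-circuit idea can be sketched at the user level as follows. The function equal_values is a hypothetical helper written for illustration, not an existing pandas API and not the proposed patch itself. It assumes both inputs are DatetimeIndex objects with compatible timezones, and it does not treat NaT entries as equal to each other in the fallback path.

```python
import pandas as pd


def equal_values(left: pd.DatetimeIndex, right: pd.DatetimeIndex) -> bool:
    """Hypothetical helper: value equality that ignores the dt64 unit."""
    if len(left) != len(right):
        return False
    if len(left) == 0:
        return True
    # Fast path: two regular indexes with the same freq and the same
    # length are fully determined by their first element.
    if left.freq is not None and left.freq == right.freq:
        return bool(left[0] == right[0])
    # Fallback: elementwise comparison, which works across units.
    return bool((left == right).all())


dti = pd.date_range("2016-01-01", periods=3).as_unit("ns")
print(equal_values(dti, dti.as_unit("us")))
print(equal_values(dti, dti + pd.Timedelta(days=1)))
```

The fast path does constant work regardless of length, while the fallback still scans every element; whether an approach like this is acceptable inside DatetimeArray.equals is the open question in this issue.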

Expected Behavior

N/A.

Installed Versions

Replace this line with the output of pd.show_versions()

Labels: Bug, Needs Triage