Commit f9fa158

fangchenli and claude committed
PERF: use PyArrow-native implementation for dt.total_seconds
Avoid conversion to TimedeltaArray by using PyArrow compute directly. Cast duration to int64, then to float64, and multiply by unit factor.

~3.7x speedup (3.53ms -> 0.96ms for 1M rows).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
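The conversion described in the commit message is integer tick counts multiplied by a seconds-per-tick factor. A minimal pure-Python sketch of that arithmetic follows; `ticks_to_seconds` and `UNIT_TO_SECONDS` are hypothetical names used only for illustration, and `None` stands in for a null value.

```python
from datetime import timedelta

# Seconds per tick for each duration unit, matching the lookup table in the commit.
UNIT_TO_SECONDS = {"s": 1.0, "ms": 1e-3, "us": 1e-6, "ns": 1e-9}


def ticks_to_seconds(ticks, unit):
    """Convert integer tick counts to float seconds; None is passed through as a null."""
    factor = UNIT_TO_SECONDS[unit]
    return [None if t is None else float(t) * factor for t in ticks]


seconds = ticks_to_seconds([1500, None, -250], "ms")

# Standard-library reference value for the first element.
reference = timedelta(milliseconds=1500).total_seconds()
```

Because the factor is a binary float, results should be compared with a tolerance rather than for exact equality.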
1 parent 9ee361b commit f9fa158

1 file changed: +7 -1 lines changed

pandas/core/arrays/arrow/array.py

Lines changed: 7 additions & 1 deletion
@@ -2915,7 +2915,13 @@ def _dt_to_pytimedelta(self) -> np.ndarray:
         return np.array(data, dtype=object)
 
     def _dt_total_seconds(self) -> Self:
-        result = pa.array(self._to_timedeltaarray().total_seconds(), from_pandas=True)
+        # Convert duration to seconds using PyArrow compute
+        # Must cast to int64 first since duration -> float64 is not supported
+        unit = self._pa_array.type.unit
+        unit_to_seconds = {"s": 1.0, "ms": 1e-3, "us": 1e-6, "ns": 1e-9}
+        factor = unit_to_seconds[unit]
+        int_arr = pc.cast(self._pa_array, pa.int64())
+        result = pc.multiply(pc.cast(int_arr, pa.float64()), factor)
         return self._from_pyarrow_array(result)
 
     def _dt_as_unit(self, unit: str) -> Self:
