Description
Feature Type
- Adding new functionality to pandas
- Changing existing functionality in pandas
- Removing existing functionality in pandas
Problem Description
Pandas currently has no way to read or write data in the Vortex format (https://lf-ai.github.io/vortex/), a new open columnar storage format developed under the Linux Foundation. As Vortex adoption grows, pandas users cannot easily load these datasets without first converting them to Parquet or Arrow, adding unnecessary steps and limiting interoperability.
Feature Description
Add optional pandas IO support for Vortex, similar to existing Parquet and Arrow readers/writers. This could include:
- pd.read_vortex(path, **kwargs)
- DataFrame.to_vortex(path, **kwargs)
Support may rely on a Python Vortex library or an Arrow-based bridge if one becomes available.
Alternative Solutions
- Convert Vortex → Parquet or Arrow → pandas
- Use custom Python wrappers
- Switch to other DataFrame libraries that add Vortex support
These options work, but each one adds a conversion or maintenance step, so they are less efficient and less user-friendly than a built-in reader and writer.
Additional Context
Vortex is designed for high-performance, compressed, columnar analytics and aligns well with Arrow-style data workflows. Native pandas interoperability would make it easier for the Python ecosystem to evaluate and adopt the format.