ENH: Add read/write support for the Vortex columnar format #63418

@d33bs

Feature Type

  • Adding new functionality to pandas
  • Changing existing functionality in pandas
  • Removing existing functionality in pandas

Problem Description

pandas currently has no built-in way to read or write data in the Vortex format (https://lf-ai.github.io/vortex/), a new open columnar storage format developed under the Linux Foundation. As Vortex adoption grows, pandas users cannot load these datasets without first converting them to Parquet or Arrow, which adds extra steps and limits interoperability.

Feature Description

Add optional pandas IO support for Vortex, similar to the existing Parquet and Feather (Arrow IPC) readers and writers. This could include:

  • pd.read_vortex(path, **kwargs)
  • DataFrame.to_vortex(path, **kwargs)

Support may rely on a Python Vortex library or an Arrow-based bridge if one becomes available.
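
For illustration, here is a minimal sketch of how the Arrow-based bridge could be wired up as an optional dependency. Everything on the Vortex side is an assumption: the module name `vortex` and the functions `read_arrow_table` and `write_arrow_table` are hypothetical placeholders for whatever the Vortex Python library actually exposes, and `import_optional_dependency` here is a simplified stand-in, not pandas' internal helper. Only `Table.to_pandas()` and `Table.from_pandas()` are real pyarrow calls.

```python
import importlib
from typing import Any


def import_optional_dependency(name: str, feature: str) -> Any:
    """Import `name` lazily and raise a helpful error if it is missing."""
    try:
        return importlib.import_module(name)
    except ImportError as exc:
        raise ImportError(
            f"Missing optional dependency '{name}'. "
            f"{feature} requires it; install it with pip or conda."
        ) from exc


def read_vortex(path, *, engine_module: str = "vortex", **kwargs):
    """Hypothetical reader: Vortex file -> Arrow table -> DataFrame."""
    engine = import_optional_dependency(engine_module, "read_vortex")
    # ASSUMPTION: `read_arrow_table` is a placeholder name for a call that
    # returns a pyarrow.Table; the real Vortex API may look different.
    table = engine.read_arrow_table(path, **kwargs)
    return table.to_pandas()


def to_vortex(df, path, *, engine_module: str = "vortex", **kwargs) -> None:
    """Hypothetical writer: DataFrame -> Arrow table -> Vortex file."""
    pa = import_optional_dependency("pyarrow", "to_vortex")
    engine = import_optional_dependency(engine_module, "to_vortex")
    table = pa.Table.from_pandas(df)
    # ASSUMPTION: `write_arrow_table` is likewise a placeholder name.
    engine.write_arrow_table(table, path, **kwargs)
```

Importing the engine inside each function keeps Vortex optional: users who never call `read_vortex` or `to_vortex` would not need the library installed, and those who do would get an actionable error message instead of a bare `ModuleNotFoundError`.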

Alternative Solutions

  • Convert Vortex → Parquet or Arrow → pandas
  • Use custom Python wrappers
  • Switch to other DataFrame libraries that add Vortex support

These options work, but each adds a conversion step or an extra dependency, so they are less efficient and less user-friendly than native support.

Additional Context

Vortex is designed for high-performance, compressed, columnar analytics and aligns well with Arrow-style data workflows. Native pandas interoperability would make it easier for the Python ecosystem to evaluate and adopt the format.

Metadata

    Labels

    Enhancement; IO Format Request (request for a new format to support); Needs Triage (issue that has not been reviewed by a pandas team member)