[Python] S3FileSystem ignores AWS_ENDPOINT_URL / AWS_ENDPOINT_URL_S3 environment variables #49643

@mohan-armada

Description

Describe the bug, including details regarding any error messages, version, and platform.

I am trying to use PyArrow S3FileSystem with a custom S3-compatible endpoint (MinIO).
When using endpoint_override explicitly, everything works:

```python
fs.S3FileSystem(endpoint_override="http://10.148.0.2:9000")
```

However, when relying on the environment variable
`AWS_ENDPOINT_URL_S3=http://10.148.0.2:9000`
and initializing with
```python
fs.S3FileSystem()
```
the request is still sent to AWS S3 instead of the custom endpoint, resulting in:

AWS Error ACCESS_DENIED during HeadObject operation

This suggests that environment-based endpoint configuration is not being honored.

Reproducible Example:

```python
import os
import pyarrow.fs as fs

os.environ["AWS_ACCESS_KEY_ID"] = "xxx"
os.environ["AWS_SECRET_ACCESS_KEY"] = "xxx"
os.environ["AWS_ENDPOINT_URL_S3"] = "http://10.148.0.2:9000"
os.environ["AWS_S3_ADDRESSING_STYLE"] = "path"

s3 = fs.S3FileSystem()
print(s3.get_file_info("bucket/key"))
```

It works if I pass the endpoint explicitly:

```python
s3 = fs.S3FileSystem(endpoint_override="http://10.148.0.2:9000")
```

Expected Behavior:
S3FileSystem should connect to the endpoint configured via the environment variables (http://10.148.0.2:9000).

Actual Behavior:
Requests are sent to AWS S3 (s3.amazonaws.com); the endpoint environment variables are ignored.

Environment:
PyArrow version: 23.0.0
Python version: 3.11.11
Deployment: Kubernetes / Ray worker
S3 backend: MinIO

I understand the AWS SDK may not officially support AWS_ENDPOINT_URL in every client, but PyArrow documents S3-specific environment variables of its own. It is unclear which of these are actually honored, and how endpoint resolution is intended to work without passing endpoint_override.

Component(s)

Python
