Introduce version-specific behavior in Spark expressions #21698

@andygrove

Description

Is your feature request related to a problem or challenge?

Many Spark expressions have different behavior across Spark versions. This is especially the case when comparing Spark 3.x and 4.x where there are many breaking changes.

I think it is important to start addressing this in the Spark-compatible expressions in DataFusion.

FWIW, the approach we take in Comet is that each expression has a getSupportLevel method that can return Compatible, Incompatible(reason), or Unsupported. These methods are context-aware based on the Spark version, Spark configuration (e.g. whether ANSI mode is enabled), and the specific arguments being passed to the expression. Here is an example:

override def getSupportLevel(expr: Reverse): SupportLevel = {
  if (containsBinary(expr.child.dataType)) {
    Incompatible(Some("reverse on array containing binary is not supported"))
  } else {
    Compatible(None)
  }
}

We also have a shim layer where we can implement different code per Spark version. This was necessary for Comet because there are API changes in Spark between versions and we need to compile per-version. It is simpler for DataFusion because we can just pass the Spark version (or some other flags) into the expression constructor.
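As a rough illustration of that last point, here is a minimal Rust sketch of what a version-aware support check could look like in DataFusion. The names here (SupportLevel, SparkCompatContext, reverse_support_level) are hypothetical and not existing DataFusion APIs; this just mirrors the Comet getSupportLevel example above with the Spark version passed in as plain data rather than via a shim layer.

```rust
/// Hypothetical sketch only: none of these types exist in DataFusion today.
#[derive(Debug, PartialEq)]
pub enum SupportLevel {
    Compatible,
    Incompatible(String),
    Unsupported,
}

/// Context that could be passed into a Spark-compatible expression's
/// constructor instead of compiling per Spark version.
pub struct SparkCompatContext {
    /// (major, minor), e.g. (3, 5) or (4, 0)
    pub spark_version: (u32, u32),
    /// Whether ANSI mode is enabled in the Spark configuration.
    pub ansi_mode: bool,
}

/// Mirrors the Comet `reverse` example: arrays containing binary are
/// flagged incompatible; everything else is compatible.
pub fn reverse_support_level(
    ctx: &SparkCompatContext,
    contains_binary: bool,
) -> SupportLevel {
    if contains_binary {
        SupportLevel::Incompatible(
            "reverse on array containing binary is not supported".to_string(),
        )
    } else {
        // Version- or config-specific branches would go here, e.g.
        // matching on ctx.spark_version or ctx.ansi_mode for an
        // expression whose semantics changed between 3.x and 4.x.
        let _ = (ctx.spark_version, ctx.ansi_mode);
        SupportLevel::Compatible
    }
}
```

The point of the sketch is that because DataFusion does not need per-version compilation, the version check can be ordinary runtime data on the expression rather than a compile-time shim.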

Describe the solution you'd like

No response

Describe alternatives you've considered

No response

Additional context

No response

Labels: enhancement (New feature or request)