
[FLINK-39458][table] Move Table type converters and serializers into new flink-table-type-utils module #27980

Open
autophagy wants to merge 2 commits into apache:master from autophagy:FLINK-39458

Contributor

autophagy commented Apr 20, 2026

What is the purpose of the change

This PR extracts Table API type utilities, such as the data structure converters and serializers, into a lightweight flink-table-type-utils module. Other modules can then use the converters and serializers without pulling in heavy dependencies such as the runtime and planner.

This also involved moving a few classes from flink-runtime to flink-core so that both the runtime and the type-utils module can use them. They are implementations of interfaces already defined in flink-core and do not depend on the runtime module at large, so I felt it would be okay to move them. I'm still unsure whether this is right, or whether they should live in some new module.

An alternative approach I tried was moving only the DataConverters and using a kind of strategy pattern for the converters that rely on some of the serializers. That felt too clunky, and since the serializers were already in a type-utils subdirectory, moving them felt appropriate.
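For readers unfamiliar with the rejected alternative, here is a minimal sketch of what a strategy-based converter could look like. It is an illustration under assumptions, not code from this PR: `SerializationStrategy`, `ArrayConverter`, and the byte-array "internal" format are hypothetical names invented for the example and do not exist in Flink.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: none of these types exist in Flink.
public class ConverterStrategySketch {

    /** Strategy supplied by the heavier module (e.g. the runtime) that owns the serializers. */
    interface SerializationStrategy<T> {
        byte[] serialize(T value);
    }

    /** Lightweight converter that has no compile-time dependency on a concrete serializer. */
    static final class ArrayConverter<T> {
        private final SerializationStrategy<T> elementStrategy;

        ArrayConverter(SerializationStrategy<T> elementStrategy) {
            this.elementStrategy = elementStrategy;
        }

        /** Converts each external element by delegating to the injected strategy. */
        byte[][] toInternal(T[] external) {
            byte[][] internal = new byte[external.length][];
            for (int i = 0; i < external.length; i++) {
                internal[i] = elementStrategy.serialize(external[i]);
            }
            return internal;
        }
    }

    public static void main(String[] args) {
        // The module that owns the serializer wires it in at construction time.
        SerializationStrategy<String> utf8 = s -> s.getBytes(StandardCharsets.UTF_8);
        ArrayConverter<String> converter = new ArrayConverter<>(utf8);

        byte[][] internal = converter.toInternal(new String[] {"a", "bc"});
        System.out.println(internal.length + " elements converted");
    }
}
```

The cost of this shape is that every converter needing a serializer grows an extra constructor parameter and every call site has to supply it, which is roughly the clunkiness described above.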

Brief change log

  • Created new flink-table-type-utils module
  • Moved DataStructureConverter/DataStructureConverters and their implementations into the new submodule
  • Moved serializers from flink-table-runtime's type-utils subdirectory into the new submodule
  • Moved AbstractPagedInputView, AbstractPagedOutputView, ListMemorySegmentSource, RandomAccessInputView and RandomAccessOutputView from flink-runtime to flink-core to avoid circular dependencies (these are pure I/O classes that do not depend on the wider runtime module, and they implement interfaces already defined in flink-core)

Verifying this change

  • This change is already covered by existing tests

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): (yes - new flink-table-type-utils module)
  • The public API, i.e., is any changed class annotated with @Public(Evolving): (no)
  • The serializers: (yes - moved to new module, no functional changes)
  • The runtime per-record code paths (performance sensitive): (no)
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no)
  • The S3 file system connector: (no)

Documentation

  • Does this pull request introduce a new feature? (no)
  • If yes, how is the feature documented? (not applicable)

Collaborator

flinkbot commented Apr 20, 2026

CI report:

Bot commands

The @flinkbot bot supports the following commands:
  • @flinkbot run azure: re-run the last Azure build

autophagy force-pushed the FLINK-39458 branch 2 times, most recently from 3a4212b to 297a60c (April 20, 2026 17:27)
autophagy marked this pull request as ready for review (April 21, 2026 07:19)
Contributor

twalthr left a comment

Thanks for the PR. Overall the changes make a lot of sense. We have had multiple discussions in the past about whether to move type-conversion-related classes into a separate module. IMHO the PR moves more than I expected. For the test harness we only need "the new type stack", so basically everything under DataStructureConverters, and not necessarily DataFormatConverters. Very special serializers such as SortedMapSerializer can stay in the runtime module. Here is a list of things I identified during the review; it is not necessarily complete:

- WindowKey and WindowKeySerializerTest.java can stay in runtime
- RawValueDataAsserter.java into table-common?
- LinkedListSerializerTest can stay in runtime
- Move DataFormatTestUtil.rowDataToString into the test that uses it
- SortedMapSerializer, SortedMapTypeInfo, PagedTypeSerializer into table-runtime
- TypeCheckUtils should stay in runtime if possible?
- DataFormatConverters and, transitively, classes such as TimestampDataTypeInfo stay in runtime
- LegacyTimestampTypeInfo stays in runtime

Comment thread flink-dist/src/main/assemblies/bin.xml Outdated
Comment thread flink-python/pom.xml Outdated
<version>${project.version}</version>
<scope>provided</scope>
</dependency>
<dependency>
Contributor

Make sure it is included in table-runtime; then we won't need this dependency in every pom.xml file.

Contributor

Still valid.

@autophagy
Contributor Author

@twalthr Thank you for the feedback! Yeah, it looks like I was a little overzealous in moving so much over. Keeping things confined to the structure converters and the necessary serializers makes sense.

One area where I've diverged from your suggestion is moving PagedTypeSerializer back to runtime. Unfortunately, it's required by BinaryRowDataSerializer/RowDataSerializer, which the BinaryWriter used by the ArrayObjectArrayConverter and MapMapConverter depends on. I'm not quite sure how to handle this, other than by stripping the BinaryWriter usage out of those converters (or somehow injecting it from runtime). Or should those converters stay in runtime too? They seem like fairly core converters.

Moving RawValueDataAsserter into table-common also seems like it would create a circular dependency: it would need to pull in RawValueDataSerializer from type-utils, which already depends on table-common.
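The circular dependency can be made concrete with a small sketch. It models only the two modules named above, which is a deliberate simplification of the real Maven module graph, and the class `ModuleCycleSketch` and its methods are hypothetical helpers written for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: a simplified two-module view of the dependency graph.
public class ModuleCycleSketch {

    /** Returns true if some module can reach itself by following dependency edges. */
    static boolean hasCycle(Map<String, List<String>> deps) {
        Set<String> finished = new HashSet<>();
        Set<String> onPath = new HashSet<>();
        for (String module : deps.keySet()) {
            if (visit(module, deps, finished, onPath)) {
                return true;
            }
        }
        return false;
    }

    /** Depth-first search; a module seen again while still on the current path means a cycle. */
    private static boolean visit(
            String module,
            Map<String, List<String>> deps,
            Set<String> finished,
            Set<String> onPath) {
        if (onPath.contains(module)) {
            return true;
        }
        if (finished.contains(module)) {
            return false;
        }
        onPath.add(module);
        for (String dependency : deps.getOrDefault(module, Collections.<String>emptyList())) {
            if (visit(dependency, deps, finished, onPath)) {
                return true;
            }
        }
        onPath.remove(module);
        finished.add(module);
        return false;
    }

    /** The edge stated in the comment above: type-utils depends on table-common. */
    static Map<String, List<String>> currentGraph() {
        Map<String, List<String>> deps = new HashMap<>();
        deps.put("flink-table-type-utils",
                new ArrayList<>(Arrays.asList("flink-table-common")));
        deps.put("flink-table-common", new ArrayList<>());
        return deps;
    }

    public static void main(String[] args) {
        Map<String, List<String>> deps = currentGraph();
        System.out.println("cycle before move: " + hasCycle(deps));

        // Moving RawValueDataAsserter into table-common would add this reverse edge.
        deps.get("flink-table-common").add("flink-table-type-utils");
        System.out.println("cycle after move: " + hasCycle(deps));
    }
}
```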

Contributor

twalthr left a comment

Thanks for the update @autophagy. I still found 3 classes that we might not need to move as they are legacy classes:

- AbstractMapTypeInfo
- BigDecimalTypeInfo
- StringDataTypeInfo

Comment thread flink-libraries/flink-state-processing-api/pom.xml Outdated
Comment thread flink-python/pom.xml Outdated
<version>${project.version}</version>
<scope>provided</scope>
</dependency>
<dependency>
Contributor

Still valid.

Comment thread flink-table/flink-table-planner/pom.xml Outdated
3 participants