TIKA-4679: Add HTTP/2 support to tika-server via Jetty http2-server#2672
nddipiazza wants to merge 5 commits into main
- Add tika-e2e-tests/tika-server module with TikaServerHttp2Test
- Test starts the real fat-jar and verifies HTTP/2 (h2c) responses via Java HttpClient configured with Version.HTTP_2
- Wire module into tika-e2e-tests/pom.xml modules list
- Module is skipped by default; enable with -Pe2e profile

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
I think we're good with Java 17.
…h-check

- Add Assumptions.assumeTrue(jar.exists()) so tests skip gracefully when the tika-server-standard fat-jar hasn't been built (CI without prior install)
- Change startup health-check from / to /status (more reliable 200 response)
- Increase startup timeout to 90s for slower CI environments

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
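The startup health-check with a timeout, as described in this commit, might look roughly like the sketch below. `StartupProbe` and `awaitReady` are hypothetical names, not taken from the PR; the probe is passed in as a function so the polling logic stays independent of the endpoint (`/` or `/status`) and of the HTTP client used.

```java
import java.time.Duration;
import java.util.function.IntSupplier;

/** Hypothetical helper: polls a status probe until it reports 200 or a deadline passes. */
final class StartupProbe {

    private StartupProbe() {
    }

    /**
     * @param statusProbe  returns the latest HTTP status code, or -1 while the server is unreachable
     * @param timeout      overall budget (the commit above uses 90s)
     * @param pollInterval pause between two probes
     * @return true if the probe returned 200 before the timeout elapsed
     */
    static boolean awaitReady(IntSupplier statusProbe, Duration timeout, Duration pollInterval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (statusProbe.getAsInt() == 200) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // budget used up without a 200
            }
            try {
                Thread.sleep(pollInterval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
        }
    }
}
```

A test would call it as `StartupProbe.awaitReady(probe, Duration.ofSeconds(90), Duration.ofMillis(500))`, where `probe` issues the GET and maps connection failures to -1.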
Pull request overview
This PR adds HTTP/2 (h2c cleartext) support to tika-server by adding the org.eclipse.jetty.http2:http2-server jar as a dependency. CXF's Jetty transport automatically detects this jar on the classpath and enables h2c negotiation alongside HTTP/1.1 on the existing port. No application code changes are needed — just the dependency addition.
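As an illustration of what "detects this jar on the classpath" means in general, here is a minimal sketch of classpath-based feature detection. It is not CXF's actual code. `Http2Classpath` is a hypothetical helper, and the fully qualified class name is an assumption about where Jetty ships `HTTP2CServerConnectionFactory`.

```java
/** Hypothetical helper illustrating classpath-based feature detection; not CXF's actual code. */
final class Http2Classpath {

    // Assumed fully qualified name of the h2c connection factory in the http2-server jar.
    static final String H2C_FACTORY = "org.eclipse.jetty.http2.server.HTTP2CServerConnectionFactory";

    private Http2Classpath() {
    }

    /** Returns true if the named class can be loaded by this class's class loader. */
    static boolean isPresent(String className) {
        try {
            // initialize=false: only check loadability, do not run static initializers.
            Class.forName(className, false, Http2Classpath.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }
}
```

A check such as `Http2Classpath.isPresent(Http2Classpath.H2C_FACTORY)` is also one way a deployment could verify whether the optional jar is actually present.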
Changes:
- Added `http2-server` to the parent BOM dependency management and as a dependency in `tika-server-core`
- Added a unit test (`testH2c`) in `TikaServerIntegrationTest` verifying HTTP/2 negotiation
- Added a new `tika-e2e-tests/tika-server` module with end-to-end tests that start the actual fat-jar and validate HTTP/2 (h2c) on both status and parse endpoints
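The negotiation check mentioned in the list above can be sketched with the JDK's `java.net.http.HttpClient`. This is not the PR's test code; `H2cProbe` and `negotiatedVersion` are hypothetical names. The client prefers HTTP/2 and, over cleartext, falls back to HTTP/1.1 when the server does not upgrade, so the returned version shows what was actually negotiated.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

/** Hypothetical probe: reports which HTTP version a server actually negotiated. */
final class H2cProbe {

    private H2cProbe() {
    }

    static HttpClient.Version negotiatedVersion(URI uri) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newBuilder()
                .version(HttpClient.Version.HTTP_2) // prefer HTTP/2; h2c upgrade over cleartext
                .connectTimeout(Duration.ofSeconds(5))
                .build();
        HttpRequest request = HttpRequest.newBuilder(uri)
                .GET() // body-less request
                .timeout(Duration.ofSeconds(10))
                .build();
        HttpResponse<Void> response = client.send(request, HttpResponse.BodyHandlers.discarding());
        return response.version(); // HTTP_2 if the upgrade succeeded, HTTP_1_1 otherwise
    }
}
```

Against a tika-server with `http2-server` on the classpath, a call such as `H2cProbe.negotiatedVersion(URI.create("http://localhost:9998/"))` would be expected to return `HTTP_2`, assuming the default port.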
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| tika-parent/pom.xml | Adds http2-server artifact to the dependency management block using ${jetty.http2.version} |
| tika-server/tika-server-core/pom.xml | Adds http2-server as a compile dependency (version inherited from parent BOM) |
| TikaServerIntegrationTest.java | Adds testH2c() unit test using Java's HttpClient to verify HTTP/2 negotiation |
| tika-e2e-tests/pom.xml | Registers the new tika-server e2e test module |
| tika-e2e-tests/tika-server/pom.xml | New e2e module POM with surefire skip-by-default and -Pe2e profile activation |
| TikaServerHttp2Test.java | New e2e test class that starts the fat-jar process and validates h2c on status and parse endpoints |
- Use tika-server-standard assembly zip (unpacked via dependency plugin) instead of thin jar, so the required lib/ dependencies are available
- Health-check endpoint changed from /status to / (root always returns 200; /status requires explicit endpoint config to be enabled)
- Pre-negotiate h2c before PUT /tika parse test: h2c Upgrade requires a no-body request first; GET / establishes the HTTP/2 connection so the subsequent PUT reuses it correctly
- Drop --noFork flag (TikaServerCli does not recognize it; server runs its own fork management independently)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
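The pre-negotiation step from this commit could be sketched as follows. `H2cWarmup` and `putAfterWarmup` are hypothetical names, and this is not the PR's test code. The sketch assumes the caller passes a single `HttpClient` built with `Version.HTTP_2`, so that the PUT can reuse the connection set up by the body-less GET.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

/** Hypothetical helper: warm up the connection with a body-less GET, then send the PUT. */
final class H2cWarmup {

    private H2cWarmup() {
    }

    static HttpResponse<String> putAfterWarmup(HttpClient client, URI base, String path, byte[] body)
            throws IOException, InterruptedException {
        // Body-less request first: per the commit note, the h2c Upgrade needs a request without a body.
        HttpRequest warmup = HttpRequest.newBuilder(base.resolve("/")).GET().build();
        client.send(warmup, HttpResponse.BodyHandlers.discarding());

        // Same client instance, so the PUT can run on the already negotiated connection.
        HttpRequest put = HttpRequest.newBuilder(base.resolve(path))
                .PUT(HttpRequest.BodyPublishers.ofByteArray(body))
                .build();
        return client.send(put, HttpResponse.BodyHandlers.ofString());
    }
}
```

The caller can then inspect `response.version()` on the returned response to confirm the PUT was served over HTTP/2.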
- Remove unused moduleDir variable; initialize repoRoot directly
- stopServer() now uses waitFor(5s) + destroyForcibly() + waitFor(30s) to avoid indefinite blocking if SIGTERM doesn't terminate the process

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
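A bounded shutdown along the lines of this commit might look like the sketch below. `ServerStopper` is a hypothetical name, and the initial `destroy()` call is an assumption: the commit message only lists the two waits and `destroyForcibly()`.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: stop a child process without blocking indefinitely. */
final class ServerStopper {

    private ServerStopper() {
    }

    /** Returns true if the process has exited by the time this method returns. */
    static boolean stop(Process process) {
        process.destroy(); // assumption: polite termination first (SIGTERM on POSIX)
        try {
            if (process.waitFor(5, TimeUnit.SECONDS)) {
                return true; // exited within the grace period
            }
            process.destroyForcibly(); // escalate once the grace period is over
            return process.waitFor(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            process.destroyForcibly();
            return !process.isAlive();
        }
    }
}
```

Both waits are bounded, so a test teardown calling `ServerStopper.stop(process)` returns after at most about 35 seconds even if the child ignores the first signal.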
Addressed both Copilot review comments in commit 30b9ff3:
@copilot should I be trying to make the http2 option optional so people aren't forced to have that on the classpath?
Summary
Adds HTTP/2 (h2c cleartext) support to tika-server by including the `org.eclipse.jetty.http2:http2-server` jar on the classpath. When this jar is present, CXF's Jetty transport automatically negotiates HTTP/2 alongside HTTP/1.1 on the existing port (default 9998). Existing HTTP/1.1 clients are completely unaffected.

This implements TIKA-4679. The core dependency change was originally contributed by Lawrence Moorehead (@elemdisc); see elemdisc/tika PR#1. It is cherry-picked here with full author credit.
Changes
tika-parent/pom.xml
- Adds `http2-server` to the dependency management block alongside the existing `http2-hpack`, `http2-client`, `http2-common` entries (all at `${jetty.http2.version}`)

tika-server/tika-server-core/pom.xml (Lawrence Moorehead's commit)
- Adds `org.eclipse.jetty.http2:http2-server` runtime dependency (version from parent BOM)

tika-server/tika-server-core/src/test/.../TikaServerIntegrationTest.java (Lawrence Moorehead's commit)
- Adds `testH2c()` unit test that sends a request via `HttpClient.Version.HTTP_2` and asserts the response was served over HTTP/2

tika-e2e-tests/tika-server/ (new module)
- Skipped by default; enable with `-Pe2e`

tika-e2e-tests/pom.xml
- Registers the new tika-server e2e test module

How it works
Adding `http2-server` to the classpath is sufficient for h2c (HTTP/2 cleartext) support. CXF's `JettyHTTPServerEngineFactory` detects the jar at startup and wires in `HTTP2CServerConnectionFactory`. No startup code changes are required.

For h2 over TLS (recommended for production), configure `TlsConfig` in `tika-server.json`. Java 17's built-in ALPN handles protocol negotiation automatically; no separate ALPN agent is needed.

Port management
- `EXPOSE 9998` and health-check are unchanged

Shutdown note
HTTP/2 multiplexes multiple requests over a single TCP connection. The current `shutdownNow()` path does not send a GOAWAY frame before closing. Under moderate load this is acceptable for h2c, but a future improvement could add a drain timeout for graceful HTTP/2 shutdown.

Backward compatibility
Purely additive classpath change:
Testing Instructions
Manually with curl (after starting the server):
Review Checklist
- `http2-server` version comes from `${jetty.http2.version}` in parent BOM (not hardcoded)
- `TikaServerIntegrationTest#testH2c` passes
- … `-Pe2e`

Potential Concerns
- A `jetty-alpn-java-server` dependency may be needed depending on the Jetty version and JVM. This can be addressed in a follow-up.
- The `http2-server` jar adds ~500 KB to `tika-server-standard`. This also increases the `apache/tika` Docker image slightly.
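On the ALPN concern: the JDK itself exposes ALPN through `SSLParameters.setApplicationProtocols`, which can be exercised without any agent. The sketch below only demonstrates that JDK-level API. `AlpnCheck` is a hypothetical name, and whether Jetty additionally needs `jetty-alpn-java-server` to use this API is a separate question the sketch does not answer.

```java
import java.security.NoSuchAlgorithmException;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLEngine;
import javax.net.ssl.SSLParameters;

/** Hypothetical check: configure ALPN protocols on a JDK SSLEngine and read them back. */
final class AlpnCheck {

    private AlpnCheck() {
    }

    static String[] configuredProtocols(String... protocols) {
        try {
            SSLEngine engine = SSLContext.getDefault().createSSLEngine();
            SSLParameters params = engine.getSSLParameters();
            params.setApplicationProtocols(protocols); // e.g. "h2", "http/1.1", in preference order
            engine.setSSLParameters(params);
            return engine.getSSLParameters().getApplicationProtocols();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("No default SSLContext available", e);
        }
    }
}
```

Calling `AlpnCheck.configuredProtocols("h2", "http/1.1")` only confirms that the running JVM accepts the ALPN configuration; an actual h2 handshake against tika-server would still need the TLS setup described above.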