-
Notifications
You must be signed in to change notification settings - Fork 2.2k
[Gateway V2][DO NOT MERGE]: Integrate Stored Procedure and Query Plan request routing to a Gateway V2 endpoint. #47759
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
[Gateway V2][DO NOT MERGE]: Integrate Stored Procedure and Query Plan request routing to a Gateway V2 endpoint. #47759
Conversation
|
/azp run java - cosmos - tests |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
/azp run java - cosmos - tests |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
/azp run java - cosmos - tests |
|
Azure Pipelines successfully started running 1 pipeline(s). |
| @BeforeClass(groups = {"thinclient"}) | ||
| public void beforeClass() { | ||
| // If running locally, uncomment these lines | ||
| // System.setProperty("COSMOS.THINCLIENT_ENABLED", "true"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we still need these? probably can clean up now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At least COSMOS.THINCLIENT_ENABLED is always needed from an environment variable perspective (I'll clean the HTTP/2 enabled environment variable).
|
|
||
| if (allDocs != null && !allDocs.isEmpty()) { | ||
| for (ObjectNode doc : allDocs) { | ||
| String id = doc.get(ID_FIELD).asText(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems can use the following deleteDocuments() method below. And also not sure how many docs need to clean each time, using bulk can be faster
| for (Map.Entry<String, AggregateOperator> aliasToAggregate : aggregateAliasToAggregateType.entrySet()) { | ||
| String alias = aliasToAggregate.getKey(); | ||
| AggregateOperator aggregateOperator = null; | ||
| Object aliasAggregateOperator = aliasToAggregate.getValue(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
defined but not used?
| requestHeaders); | ||
| queryPlanRequest.useGatewayMode = true; | ||
|
|
||
| // queryPlanRequest.useGatewayMode = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: remove
| } | ||
|
|
||
| // 3. Execute SELECT * FROM C WHERE c.id = @id query | ||
| String query = "SELECT * FROM c WHERE c.id = @id"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
curious -> does full text search etc supported?does not see the tests here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unsure - (likewise about vector search / hybrid search). I'll confirm with the proxy service team.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should work because there is no specific logic in Gateway
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should have a pipeline running all the query tests in Gateway mode with thin client - let's sync-up at our JVM meeting today.
xinlian12
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks
| @@ -46,80 +54,360 @@ | |||
| public class ThinClientE2ETest { | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a though: Why not use the existing tests and use the clientBuilders pattern instead and add a thinproxy clientbuilder. In this way, we can slowly add this provider to as many existing tests as we want and have them run on thinproxy?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mbhaskar - sounds good. I'll make this change.
| try { | ||
| // Use Jackson to deserialize using PartitionKeyInternal's custom deserializer | ||
| return Utils.getSimpleObjectMapper().treeToValue(node, PartitionKeyInternal.class); | ||
| } catch (Exception e) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Catch specific exeption types? like JsonProcessingException
FabianMeiswinkel
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM - Thanks
PR Description
Title
Add Query Plan and Execute Stored Procedure Support for Thin Client Mode in Azure Cosmos DB Java SDK
Summary
This PR adds support for routing
QueryPlanandExecuteJavaScript(stored procedure execution) operations through the thin client proxy endpoint, enabling these operations to use the same optimized pathway as other thin client operations. Additionally, it includes comprehensive end-to-end tests for thin client query and stored procedure functionality.Changes
Core Changes
Stored Procedure Execution Support
RxDocumentServiceRequest.java:isExecuteStoredProcedureBasedRequest()method to identify stored procedure execution requests (ResourceType.StoredProcedure+OperationType.ExecuteJavaScript)RxDocumentClientImpl.java:useThinClientStoreModel()to allow stored procedure execution requests through thin client:request.isExecuteStoredProcedureBasedRequest()in resource type validationrequest.isExecuteStoredProcedureBasedRequest()to supported operations listQuery Plan Support
RxDocumentClientImpl.java:useThinClient(RxDocumentServiceRequest)method toIDocumentQueryClientinterface implementationgetStoreModel()to routeQueryPlanoperations to gateway proxy when not using thin clientOperationType.QueryPlanto supported thin client operationsThinClientStoreModel.java:QueryPlanoperations to allow null resolved partition key range (since query plans don't require partition resolution)QueryPlanRetriever.java:
getQueryPlanThroughGatewayAsync()to acceptDocumentCollectionparameterqueryClient.useThinClient()PartitionKeyDefinitiontoPartitionedQueryExecutionInfofor EPK conversionPartitionedQueryExecutionInfo.java:
useThinClientModeandpartitionKeyDefinitionfieldsparseQueryRangesForThinClient()method to manually parse query ranges from thin client proxy response formatparsePartitionKeyInternal()to convert JSON representation of partition keys toPartitionKeyInternalobjectsDocumentQueryExecutionContextFactory.java:
QueryPlanRetriever.getQueryPlanThroughGatewayAsync()to pass thecollectionparameterIDocumentQueryClient.java:useThinClient(RxDocumentServiceRequest)method to the interfaceTest Coverage
ThinClientE2ETest.java:
@BeforeClass/@AfterClasslifecycle methodsBasic Query Tests:
testThinClientQuerySelectAll()- TestsSELECT * FROM cquery with full pagination drainingtestThinClientQuerySelectById()- TestsSELECT * FROM c WHERE c.id = @idquerytestThinClientQueryWithPartitionKeyOption()- TestsSELECT * FROM cwith partition key inCosmosQueryRequestOptionstestThinClientQueryLegacy()- Legacy query test with parameterized queryQuery Plan Feature Tests:
testThinClientQueryPlanOrderBy()SELECT * FROM c ORDER BY c.sortFieldtestThinClientQueryPlanAggregate()SELECT VALUE COUNT(1) FROM ctestThinClientQueryPlanWithPartitionKeyFilterSingleRange()SELECT * FROM c WHERE c.id = @idtestThinClientQueryPlanDistinct()SELECT DISTINCT VALUE c.category FROM ctestThinClientQueryPlanTop()SELECT TOP 10 * FROM ctestThinClientQueryPlanGroupBy()SELECT c.category, COUNT(1) FROM c GROUP BY c.categorytestThinClientQueryPlanInvalidQuery()SELEC * FORM ctestThinClientQueryPlanOffsetLimit()SELECT * FROM c ORDER BY c.idx OFFSET 5 LIMIT 5Stored Procedure Test:
testThinClientStoredProcedure()- Tests sproc creation and execution (creates a document via sproc)All tests assert that only thin-client endpoints are used
Technical Details: QueryPlan Response Format Differences
The thin client proxy and standard gateway return query plan responses in different formats for the
queryRangesfield. This PR adds parsing logic to handle the thin client format.Thin Client Proxy Response
Returns partition key values in raw
PartitionKeyInternalJSON array format:{ "queryInfo": { "distinctType": "None", "groupByExpressions": [], "groupByAliases": [], "orderBy": [], "orderByExpressions": [], "aggregates": [], "hasSelectValue": 0, "rewrittenQuery": "", "groupByAliasToAggregateType": {}, "hasNonStreamingOrderBy": 0 }, "queryRanges": [{ "min": ["b3a69415-c069-478a-a553-1a3340b581f0"], "max": ["b3a69415-c069-478a-a553-1a3340b581f0"], "isMinInclusive": true, "isMaxInclusive": true }] }Key difference:
minandmaxare JSON arrays representingPartitionKeyInternalvalues (e.g.,["b3a69415-c069-478a-a553-1a3340b581f0"]).Standard Gateway Response
Returns pre-computed EPK (Effective Partition Key) hex strings:
{ "partitionedQueryExecutionInfoVersion": 2, "queryInfo": { "distinctType": "None", "top": null, "offset": null, "limit": null, "orderBy": [], "orderByExpressions": [], "groupByExpressions": [], "groupByAliases": [], "aggregates": [], "groupByAliasToAggregateType": {}, "rewrittenQuery": "", "hasSelectValue": false, "dCountInfo": null, "hasNonStreamingOrderBy": false }, "queryRanges": [{ "min": "24F3B4E0560B48E552D6CCDAA7837C94", "max": "24F3B4E0560B48E552D6CCDAA7837C94", "isMinInclusive": true, "isMaxInclusive": true }], "hybridSearchQueryInfo": null }Key difference:
minandmaxare hex strings representing the computed EPK hash values.Deserialization Logic
The
PartitionedQueryExecutionInfo.getQueryRanges()method now branches based on thin client mode:queryRangesFormat"min": "24F3B4E0..."(hex string)getList()toList<Range<String>>"min": ["value"](JSON array)PartitionKeyInternal, then convert to EPK hex string viaPartitionKeyInternalHelper.getEffectivePartitionKeyString()The conversion uses the
PartitionKeyDefinitionfrom theDocumentCollectionto properly compute the EPK hash based on the partition key type (Hash, MultiHash, etc.).Operations Now Supported via Thin Client
Testing
Made changes.
All SDK Contribution checklist:
General Guidelines and Best Practices
Testing Guidelines