SQL Server: Support for full-text search TVFs (CONTAINSTABLE/FREETEXTTABLE)#37489
SQL Server: Support for full-text search TVFs (CONTAINSTABLE/FREETEXTTABLE)#37489Abde1rahman1 wants to merge 7 commits intodotnet:mainfrom
Conversation
235a837 to
9785ec0
Compare
roji
left a comment
There was a problem hiding this comment.
Thanks for submitting this PR - it's a good start but requires quite a lot of work. Translating a specialized TVF like this is quite challenging, and requires quite a bit of EF internals knowledge (it's not a good issue for a casual external contributor).
Unless you're wililng to invest significant time here and go into complicated internal architecture (just using AI here probably won't be sufficient) please let me know - I'm OK to take this over and complete it (and you can be happy with the work you've done here!).
Otherwise, see first round of comments below.
There was a problem hiding this comment.
Move these to SqlServerQueryableExtensions, similar to RelationalQueryableExtensions.
There was a problem hiding this comment.
Is this intentionally object as opposed to string? Is there a reason for that?
Also, is it possible for the full-text search column to be nullable? If so this parameter needs to be nullable too.
There was a problem hiding this comment.
Regarding the object type: I used object to stay consistent with other full-text extensions like SqlServerDbFunctionsExtensions.Contains, which takes an object property to allow the expression tree to capture the property access before it's translated. However, if the convention for Table-Valued Functions in EF Core prefers a specific type like string, I’m happy to change it.
Regarding nullability: Yes, full-text search columns can be nullable in SQL Server. I will update the parameter to be object? (or string?) to correctly reflect the provider's capabilities and avoid any potential issues with nullable properties
There was a problem hiding this comment.
It isn't ideal for this to be an untyped object, but I can't really see a way for this to be generic over the key type... The key column is determined in the CREATE FULLTEXT INDEX DDL, and we have no way of flowing that as a generic parameter here.
I'm thinking it might be a good idea to accept the key type as a generic type parameter to the FreeTextTable/ContainsTable methods; another alternative is to accept an additional parameter on the functions where the user selects the key property - that would allow us to flow that type to the return type, but it would be a bit silly (if the user specifies anything other than the key property from CREATE FULLTEXT INDEX, the call fails).
Any thoughts on this?
There was a problem hiding this comment.
I think the Generic approach (ContainsTable<TKey>) is the best way to go.
It’s simple and explicit. The user knows their database schema best, so they can just provide the correct type for the Key. This also keeps the translation logic clean and avoids the extra work of trying to infer the type from other properties.
If you're okay with this, I will update the code to use <TKey> and make FullTextTableResult generic
There was a problem hiding this comment.
Why introduce a new expression type? Is TableValuedFunctionExpression not suitable somehow?
There was a problem hiding this comment.
The reason I didn't use TableValuedFunctionExpression is that its Print method calls VisitCollection(Arguments). This treats all arguments as standard SqlExpression nodes, which the ExpressionPrinter translates into SQL literals (quoted strings) or parameters.
For CONTAINSTABLE, the first two arguments must be raw identifiers (Table and Column). By using a custom expression, I can bypass the standard Visit logic and use expressionPrinter.Append(Column.TableAlias) to print the table name directly without quotes, ensuring the SQL is syntactically correct for SQL Server
There was a problem hiding this comment.
We shouldn't need a whole new expression type for this. We should be able to use TableValuedFunctionExpression, with the first two arguments (the table and column names) just being SqlFragmentExpressions. We'd still need special handling in SqlServerQuerySqlGenerator to quote those names, but at least we wouldn't have an additional expression type.
test/EFCore.SqlServer.FunctionalTests/Query/SqlQuerySqlServerTest.cs
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Note that the first argument to these functions can only be a literal table name - not a table alias. For example, the following does not work:
SELECT *
FROM Flags AS f
WHERE (
SELECT TOP(1) RANK
FROM CONTAINSTABLE (f, FlagColors, 'Green or Black')
ORDER BY RANK DESC) <> 0;Where replacing f with a table name (Flags) does:
SELECT *
FROM Flags AS f
WHERE (
SELECT TOP(1) RANK
FROM CONTAINSTABLE (Flags, FlagColors, 'Green or Black')
ORDER BY RANK DESC) <> 0;We can use SqlFragmentExpression to represent the table and column names.
There was a problem hiding this comment.
Thank you for the clarification. You're right, CONTAINSTABLE requires the actual table name rather than the alias.
I will refactor the implementation to use SqlFragmentExpression for both the table and column names. This ensures they are treated as raw SQL fragments and bypasses the quoting issues we discussed. I'll also make sure to pull the base table name from the metadata instead of using the alias
There was a problem hiding this comment.
Requiring invokers to do ctx.Set<T>().Select() to get the property like this is problematic:
- It isn't a good dev experience.
- If the user diverges from this pattern in any way (e.g. projecting something other than a simple property), this will fail.
- What happens if there's a Where before the Select? I think your code above would just ignore that (I may be wrong). In any case, these functions do not support pre-filtering or any other operator - they accept a literal table name as the source only.
To avoid all the above problems, we should instead try to make ContainsTable an extension method over DbSet; its source (first argument) would thus represent the table, and it would accept a lambda selector for the full-text property to be searched (and of course, the actual text to be searched):
var query =
from c in ctx.Set<Customer>()
join ct in ctx.Set<Customer>().ContainsTable(c => c.ContactName, "John")
....Makes sense?
There was a problem hiding this comment.
Sure. I’ll refactor the implementation to move ContainsTable and FreeTextTable as extension methods over IQueryable<TEntity> and update the visitor to handle this new pattern
There was a problem hiding this comment.
Please move tests to NorthwindDbFunctionsQuerySqlServerTest alongside the existing full-text search tests (unless there's a specific reason not to).
There was a problem hiding this comment.
I will move them
There was a problem hiding this comment.
We need a lot more testing here - one simple test isn't enough.
|
Thanks for your reply and for the feedback. I am definitely willing to invest the necessary time to work on this feature and dive deeper into the EF Core internal architecture. I'd be honored to learn from your expertise throughout this process. If you're available to provide guidance or point me in the right direction, I would be happy to put in the work and collaborate to get this implemented correctly. |
…est.cs Co-authored-by: Shay Rojansky <roji@roji.org>
…into feature-TVFs
|
Hi @roji
|
roji
left a comment
There was a problem hiding this comment.
Here are some more comments.
Please ensure that all tests pass before requesting another review.
| /// <param name="propertySelector">A lambda expression selecting the property to search.</param> | ||
| /// <param name="searchCondition">The search condition.</param> | ||
| /// <returns>The table-valued function result containing Key and Rank columns.</returns> | ||
| public static IQueryable<FullTextTableResult<TKey>> ContainsTable<TEntity, TKey>( |
There was a problem hiding this comment.
Note that the two SQL Server functions accept two additional parameters: language and top_n_by_rank.
There was a problem hiding this comment.
Make this a readonly struct, as it's a very think wrapper that can be immutable.
There was a problem hiding this comment.
We shouldn't need a whole new expression type for this. We should be able to use TableValuedFunctionExpression, with the first two arguments (the table and column names) just being SqlFragmentExpressions. We'd still need special handling in SqlServerQuerySqlGenerator to quote those names, but at least we wouldn't have an additional expression type.
There was a problem hiding this comment.
Do not call MakeGenericType, as it's incompatible with NativeAOT, which we're starting to move towards. You should already have the correct return type on the method in the inputted MethodCallExpression.
There was a problem hiding this comment.
Call the columns KEY and RANK (uppercase) as per the documentation.
There was a problem hiding this comment.
| var alias = _queryCompilationContext.SqlAliasManager.GenerateTableAlias("ft"); | |
| var alias = _queryCompilationContext.SqlAliasManager.GenerateTableAlias("fts"); |
There was a problem hiding this comment.
This is wrong - the select projects out both the key and the column; down below you're also mixing two projection type (this is a complicated and messy part of SelectExpression currently). Instantiate a SelectExpression without any projections, then call ReplaceProjection() with a dictionary argument to provide the two projections.
249ae47 to
6b86657
Compare
|
"Hi @roji , |
|
@Abde1rahman1 I've just submitted #37536, implementing support for the SQL Server [ |
|
@Abde1rahman1 as I haven't heard back from you, and as there was still considerably (and tricky) work remaining here, I went ahead and submitted #37578, which implements this. Thanks again for working on this! |
|
Hi @roji, I’m really sorry for the late response. I’ve been quite busy with some work commitments lately and couldn't give this the focus it deserved. I’m glad you moved forward with it, and I’ll definitely check out your PR to learn from your implementation. My schedule is much clearer now, so please let me know if there are any remaining tasks or other issues I can help with. I’d love to stay involved! Thanks for the opportunity |
Description
This PR implements support for SQL Server Full-Text Search Table-Valued Functions:
CONTAINSTABLEandFREETEXTTABLE.Currently, EF Core only supports scalar Full-Text functions (
CONTAINSandFREETEXT). This PR bridges the gap by allowing users to treat Full-Text results as queryable sources, enabling access to the Rank column and allowing for efficient Joins between full-text results and entity tables.Key Architectural Changes
FullTextTableExpression(inheriting fromTableExpressionBase) to represent the TVF in the relational command tree.SqlServerQueryableMethodTranslatingExpressionVisitorto interceptContainsTableandFreeTextTablemethod calls.SelectExpressionwrapped in aShapedQueryExpression.[Key]and[Rank]columns.SqlAliasManagerfor dynamic table alias generation.ShaperExpressionusingExpression.Newto bind SQL result columns to theFullTextTableResultDTO.VisitFullTextTableinSqlServerQuerySqlGeneratorto handle the specific TVF syntax requirements in SQL Server.Example Usage
Related Issue
Fixes #11487