@srielau (Contributor) commented Dec 22, 2025

What changes were proposed in this pull request?

Allow built-in functions to be referenced with the qualifier builtin or system.builtin, and temporary functions with the qualifier session or system.session.
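
To illustrate the qualifiers described above, here is a minimal Python sketch (not the PR's Scala implementation) that classifies a dotted function name by its qualifier. The names `Namespace` and `classify_function_name` are hypothetical, and the sketch assumes case-insensitive qualifier matching and no backtick-quoted identifiers.

```python
from enum import Enum


class Namespace(Enum):
    BUILTIN = "builtin"
    SESSION = "session"
    UNQUALIFIED = "unqualified"
    PERSISTENT = "persistent"


def classify_function_name(name: str) -> tuple[Namespace, str]:
    """Split a dotted name into (namespace, function name).

    Assumption: qualifiers are matched case-insensitively and the name
    contains no quoted identifiers with embedded dots.
    """
    parts = name.split(".")
    func = parts[-1]
    qualifier = [p.lower() for p in parts[:-1]]
    if not qualifier:
        # No qualifier: resolution order is left to the caller.
        return Namespace.UNQUALIFIED, func
    if qualifier in (["builtin"], ["system", "builtin"]):
        return Namespace.BUILTIN, func
    if qualifier in (["session"], ["system", "session"]):
        return Namespace.SESSION, func
    # Anything else is treated here as a database or catalog qualifier.
    return Namespace.PERSISTENT, func
```

For example, `classify_function_name("system.builtin.abs")` yields the builtin namespace, while `classify_function_name("mydb.abs")` is treated as a persistent function reference.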

Why are the changes needed?

This extension allows users to disambiguate function references and prepares for an upcoming search path configuration.

Does this PR introduce any user-facing change?

Yes

How was this patch tested?

Added new tests

Was this patch authored or co-authored using generative AI tooling?

Yes: Claude Sonnet

@github-actions bot added the SQL label Dec 22, 2025

- Persistent functions now cached with unqualified keys for compatibility
- Temporary functions use composite keys (session.funcName)
- Both can coexist in the same registry
- Views correctly exclude temp functions from resolution

Issue: View test still failing - needs investigation into function builder resolution

- Persistent functions now stored with qualified keys (catalog.db.func)
- Prevents conflicts when multiple databases have same function name
- Temporary functions still use composite keys (session.func)
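
The keying scheme in the list above can be sketched as follows. This is an illustrative Python model, not the PR's code: `FunctionRegistrySketch` and its methods are hypothetical, and lower-casing the keys is an assumption.

```python
class FunctionRegistrySketch:
    """One registry holding persistent and temporary functions side by side."""

    def __init__(self):
        self._functions = {}

    @staticmethod
    def persistent_key(catalog: str, db: str, name: str) -> str:
        # Fully qualified key: catalog.db.func
        return f"{catalog}.{db}.{name}".lower()

    @staticmethod
    def temp_key(name: str) -> str:
        # Composite key for temporary functions: session.func
        return f"session.{name}".lower()

    def register_persistent(self, catalog, db, name, impl):
        self._functions[self.persistent_key(catalog, db, name)] = impl

    def register_temp(self, name, impl):
        self._functions[self.temp_key(name)] = impl

    def lookup_persistent(self, catalog, db, name):
        return self._functions.get(self.persistent_key(catalog, db, name))

    def lookup_temp(self, name):
        return self._functions.get(self.temp_key(name))


registry = FunctionRegistrySketch()
# Same function name in two databases: no conflict with qualified keys.
registry.register_persistent("spark_catalog", "db1", "f", "impl_db1")
registry.register_persistent("spark_catalog", "db2", "f", "impl_db2")
# A temporary function with the same name coexists under its own key.
registry.register_temp("f", "impl_temp")
```

Because a temporary key always has two parts and a persistent key three, the two kinds of entry cannot collide in this model.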

Known issues:
- View function resolution test still failing
- Possible function listing regressions to investigate

Added extensive debug logging to understand why views capture
the wrong function class. Ready for detailed tracing.

**THE BUG:**
In resolveBuiltinOrTempFunctionInternal, the 'isBuiltin' argument was
incorrectly computed by checking whether the temp/builtin identifier existed
in the session registry, instead of checking the static FunctionRegistry.builtin.

This caused lookupTempFuncWithViewContext to treat temp functions as
builtins, bypassing view context checks and allowing temp functions
created AFTER a view to incorrectly shadow the persistent function
that the view should use.

**THE FIX:**
Changed the isBuiltin check to use FunctionRegistry.builtin.functionExists
and TableFunctionRegistry.builtin.functionExists directly, matching
master's behavior.
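
The bug and fix described above can be modelled with a small Python sketch. It is a simplification under stated assumptions, not Spark's code: `resolve_for_view`, the registries, and the function name `my_avg` are hypothetical; only the class names MyDoubleAvg and MyDoubleSum come from the test mentioned in this PR.

```python
# Static builtin registry shared by all sessions (illustrative contents).
STATIC_BUILTINS = {"abs": "Abs"}


def resolve_for_view(name, session_registry, persistent_catalog,
                     view_temp_functions, is_builtin):
    """Resolve a function referenced inside a view.

    Builtins bypass the view-context check; a temp function is used only
    if the view referenced it when the view was created.
    """
    if name in session_registry:
        if is_builtin or name in view_temp_functions:
            return session_registry[name]
    # Otherwise fall back to the persistent function.
    return persistent_catalog.get(name)


# The view was created over the persistent function; a temp function
# with the same name was created afterwards.
persistent_catalog = {"my_avg": "MyDoubleAvg"}
session_registry = {**STATIC_BUILTINS, "my_avg": "MyDoubleSum"}
view_temp_functions = set()  # the view captured no temp functions

# Buggy check: any name present in the session registry counts as builtin.
buggy = resolve_for_view("my_avg", session_registry, persistent_catalog,
                         view_temp_functions,
                         is_builtin="my_avg" in session_registry)

# Fixed check: consult only the static builtin registry.
fixed = resolve_for_view("my_avg", session_registry, persistent_catalog,
                         view_temp_functions,
                         is_builtin="my_avg" in STATIC_BUILTINS)
```

In this model the buggy check lets the later temp function shadow the view's persistent function, while the fixed check falls through to the persistent one.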

**TEST RESULTS:**
✅ All 62 tests pass (PersistedViewTestSuite + FunctionQualificationSuite)
✅ SPARK-33692 view test now passes
✅ View correctly uses MyDoubleAvg and ignores temp MyDoubleSum

Removed leftover test scripts that were causing compilation errors:
- test_simple_function.scala
- test_view_function.scala

All code now compiles cleanly.

Analysis covers:
- Complete API surface (read/write operations)
- Current architecture and memory usage
- Three proposed optimization approaches
- Detailed feasibility assessment

KEY DISCOVERY: Internal functions already use separate static registry!
- FunctionRegistry.internal contains ~20 ML/Pandas/Connect functions
- Resolved directly, bypassing SessionCatalog
- Proves composite registry pattern works in production
- Validates proposed optimization approach

Memory savings potential: 98% reduction for high-session deployments
Implementation effort: 2-3 days coding + testing
Risk: Low (pattern already proven with internal functions)

Comprehensive comparison covering:
- Registry architecture (cloned vs static)
- User-facing vs implementation details
- Resolution paths and shadowing behavior
- Examples and use cases
- Historical context (Spark 4 separation)

KEY FINDINGS:
- Builtin: ~500 user-facing SQL functions, cloned per session
- Internal: ~20 implementation functions for Connect/ML/Pandas, single global registry
- Internal functions already use separate static registry pattern
- Proves composite registry approach is production-ready
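
A composite registry of the kind discussed in these findings might look like the following Python sketch: one read-only builtin map shared by every session, plus a small per-session overlay. `CompositeRegistry` and `SHARED_BUILTINS` are hypothetical names, and the lookup order (session entries before builtins) is an assumption rather than something this PR states.

```python
from types import MappingProxyType

# One immutable builtin map shared by all sessions (illustrative contents).
SHARED_BUILTINS = MappingProxyType({"abs": "Abs", "upper": "Upper"})


class CompositeRegistry:
    """Per-session view over shared builtins plus session-local functions."""

    def __init__(self, shared=SHARED_BUILTINS):
        self._shared = shared   # not copied: every session shares one map
        self._session = {}      # only session-specific entries live here

    def register_temp(self, name, impl):
        self._session[name.lower()] = impl

    def lookup(self, name):
        key = name.lower()
        # Assumed order: session-local entries first, then shared builtins.
        if key in self._session:
            return self._session[key]
        return self._shared.get(key)

    def session_entry_count(self):
        return len(self._session)


session_a = CompositeRegistry()
session_b = CompositeRegistry()
session_a.register_temp("my_func", "MyFunc")
```

Each session here stores only its own temporary functions, so the per-session cost no longer grows with the number of builtins, which is the idea behind the memory savings claimed above.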

This validates our proposed optimization approach for builtins.

Updated test to expect INVALID_TEMP_OBJ_QUALIFIER (AnalysisException)
instead of INVALID_SQL_SYNTAX.CREATE_TEMP_FUNC_WITH_DATABASE (ParseException)
for invalid temporary function qualifications.

This aligns with the user's request to treat invalid temp function
qualifications as semantic errors (SQLSTATE 42602) rather than
syntax errors.

Test cases updated:
- CREATE TEMPORARY FUNCTION a.b() - now expects INVALID_TEMP_OBJ_QUALIFIER
- CREATE TEMPORARY FUNCTION a.b.c() - now expects INVALID_TEMP_OBJ_QUALIFIER
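
The behaviour these test cases expect can be sketched in Python as a check that runs at analysis time instead of in the parser. `AnalysisError` and `validate_temp_function_name` are hypothetical stand-ins, and the sketch assumes that session and system.session are the only qualifiers accepted for a temporary function.

```python
class AnalysisError(Exception):
    """Stand-in for an analysis-time (semantic) error."""

    def __init__(self, error_class, sqlstate, message):
        super().__init__(message)
        self.error_class = error_class
        self.sqlstate = sqlstate


# Assumption: only these qualifiers are valid for temporary functions.
VALID_TEMP_QUALIFIERS = {("session",), ("system", "session")}


def validate_temp_function_name(name_parts: list[str]) -> str:
    """Return the bare function name, or raise for an invalid qualifier.

    `name_parts` must be non-empty, e.g. ["a", "b"] for the name a.b.
    """
    *qualifier, func = name_parts
    normalized = tuple(part.lower() for part in qualifier)
    if normalized and normalized not in VALID_TEMP_QUALIFIERS:
        raise AnalysisError(
            "INVALID_TEMP_OBJ_QUALIFIER",
            "42602",
            f"Invalid qualifier for temporary function: {'.'.join(name_parts)}",
        )
    return func


def error_class_for(name_parts):
    """Helper for demonstration: the error class raised, or None."""
    try:
        validate_temp_function_name(name_parts)
    except AnalysisError as err:
        return err.error_class
    return None
```

With this check, both `a.b` and `a.b.c` are rejected with INVALID_TEMP_OBJ_QUALIFIER, while an unqualified name or a `session`-qualified name passes.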

All tests pass.

These files were working notes created during development and should
not be committed to the repository:
- BUILTIN_VS_INTERNAL_FUNCTIONS.md
- CURSOR_IMPLEMENTATION_SUMMARY.md
- CURSOR_TEST_RESULTS.md
- FUNCTION_QUALIFICATION_ANALYSIS.md
- FUNCTION_QUALIFICATION_COMPLETE.md
- FUNCTION_QUALIFICATION_SUMMARY.md
- FUNCTION_REGISTRY_API_ANALYSIS.md
- IMPLEMENTATION_COMPLETE.md
- TABLE_FUNCTION_REGISTRY_ANALYSIS.md
- UNIFIED_FUNCTION_NAMESPACE.md

Only production code and tests should be in the repository.

These were working test files created during development:
- test_both.sql
- test_namespace.sql
- test_range.sql

They should not be committed to the repository.