refactor: Modularize compiler routing as proxy executor#16907
refactor: Modularize compiler routing as proxy executor#16907TrevorBergeron wants to merge 2 commits intomainfrom
Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces the DualCompilerProxyExecutor to manage the transition between ibis and sqlglot compilers, moving fallback logic out of BigQueryCachingExecutor. The BigQueryCachingExecutor is updated to support specific compiler selection and shared caching. Review feedback points out several critical issues in the new proxy executor, including the use of an undefined self._main attribute, missing imports for bigquery and CacheConfig, and type errors when passing labels to ExecutionSpec. Additionally, a bug was identified in bq_caching_executor.py where extra_labels were not correctly passed to the query execution method.
| return self._main.to_sql( | ||
| array_value, | ||
| offset_column=offset_column, | ||
| ordered=ordered, | ||
| enable_cache=enable_cache, | ||
| ) |
There was a problem hiding this comment.
self._main is not defined in DualCompilerProxyExecutor. This method should route the to_sql call to the appropriate underlying executor based on the sql_compiler option, similar to the execute method.
compiler_option = bigframes.options.experiments.sql_compiler
executor = self._ibis_executor if compiler_option == "legacy" else self._sqlglot_executor
return executor.to_sql(
array_value,
offset_column=offset_column,
ordered=ordered,
enable_cache=enable_cache,
)| sql=sql, | ||
| job_config=job_config, | ||
| session=array_value.session, | ||
| extra_labels=self._labels, |
There was a problem hiding this comment.
The extra_labels argument passed to _run_execute_query should be the extra_labels parameter received by _export_gbq, not self._labels. _run_execute_query already incorporates self._labels internally. Using self._labels here causes the labels passed from the proxy executor (which track the compiler used) to be lost.
| extra_labels=self._labels, | |
| extra_labels=extra_labels, |
| import typing | ||
| import uuid | ||
| import warnings | ||
|
|
||
| import bigframes_vendored.ibis.expr.types as ibis_types | ||
| import google.cloud.exceptions | ||
|
|
||
| import bigframes.core | ||
| from bigframes import exceptions as bfe |
There was a problem hiding this comment.
Several necessary imports are missing or unused. bigquery and bigframes are used but not imported. Mapping and Optional are used as type hints but not imported from typing. Conversely, ibis_types is imported but never used.
import typing
from typing import Mapping, Optional
import uuid
import warnings
from google.cloud import bigquery
import google.cloud.exceptions
import bigframes
import bigframes.core
from bigframes import exceptions as bfe| return self._ibis_executor.execute(array_value, execution_spec) | ||
| elif compiler_option == "experimental": | ||
| return self._sqlglot_executor.execute( | ||
| array_value, execution_spec.add_labels((_COMPILER_LABEL_KEY, "sqlglot")) |
There was a problem hiding this comment.
ExecutionSpec.add_labels expects a Mapping[str, str], but a tuple is being passed here. This will cause an AttributeError when add_labels attempts to call .items() on the input.
| array_value, execution_spec.add_labels((_COMPILER_LABEL_KEY, "sqlglot")) | |
| array_value, execution_spec.add_labels({_COMPILER_LABEL_KEY: "sqlglot"}) |
| return self._sqlglot_executor.execute( | ||
| array_value, | ||
| execution_spec.add_labels( | ||
| (_COMPILER_LABEL_KEY, f"sqlglot-{correlation_id}") |
| return self._ibis_executor.execute( | ||
| array_value, | ||
| execution_spec.add_labels( | ||
| (_COMPILER_LABEL_KEY, f"ibis-{correlation_id}") |
| self, | ||
| array_value: bigframes.core.ArrayValue, | ||
| *, | ||
| config: CacheConfig, |
|
|
||
| Does not actually execute the data but will get stats and indicate any invalid query errors. | ||
| """ | ||
| return self._ibis_executor.dry_run(array_value, ordered=ordered) |
There was a problem hiding this comment.
The dry_run method should also be routed to the appropriate executor to ensure consistency between dry runs and actual execution.
compiler_option = bigframes.options.experiments.sql_compiler
executor = self._ibis_executor if compiler_option == "legacy" else self._sqlglot_executor
return executor.dry_run(array_value, ordered=ordered)
Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:
Fixes #<issue_number_goes_here> 🦕