Have the "where" operator (and search?) surface error values by default #6313

@philrz

After observing a community user's experience, there's consensus that the where operator (and perhaps search as well) should surface any error values it operates on rather than dropping them as it currently does.

Repro Details

Repro is with super commit f11a16b. This issue was surfaced in a community Slack thread.

The user was upgrading from an older pre-release super version to a newer one that included the "syntactic sugar for array-wrapped subquery" added in #6225. They were adapting their query to the newer syntax and happened to make some mistakes along the way.

We'll use a simplified query based on theirs that illustrates the problems they experienced. Here it is working as intended in a newer super at commit f11a16b.

$ super -version
Version: f11a16be7

$ echo '[{foo:"-2"},{foo:"-1"},{foo:"0"}]' | super -c "
values [unnest this | foo:=foo::int64+1]
| unnest this
| where foo > 0" -

{foo:1}

Now imagine they're coming from an older version that used ( ) to wrap such subqueries, so one of their broken query iterations looked like this:

$ echo '[{foo:"-2"},{foo:"-1"},{foo:"0"}]' | super -c "
values (unnest this | foo:=foo::int64+1)
| unnest this
| where foo > 0" -

[no output]

Since they knew there were foo input values and the query had worked at some point in the past, they were expecting to see something in the output. Given the silence, they felt their only option was to start commenting out portions of the program, working backwards from the end, which does indeed reveal the error values that were being generated.

$ echo '[{foo:"-2"},{foo:"-1"},{foo:"0"}]' | super -c "
values (unnest this | foo:=foo::int64+1)
| unnest this
-- | where foo > 0" -

error({message:"unnest: encountered non-array value",on:error("query expression produced multiple values (consider [subquery])")})

Or if they wanted to see just the very first error encountered:

$ echo '[{foo:"-2"},{foo:"-1"},{foo:"0"}]' | super -c "
values (unnest this | foo:=foo::int64+1)
-- | unnest this
-- | where foo > 0" -

error("query expression produced multiple values (consider [subquery])")

This ultimately got them the info they needed, but they came to Slack to express:

It was just a bummer to have the original error ultimately swallowed up by follow-on commands.

Proposed Change

Right now the where docs do disclose:

The where operator filters its input by applying a Boolean expression <expr> to each input value and dropping each value for which the expression evaluates to false or to an error.

However, we can see here how the treatment of errors did not serve this user well. If they wrote their where expecting foo values to be present, but mistakes in the query or bad input data caused error values to be present where foo values would ordinarily appear, they'd probably prefer that the error values be surfaced by the where rather than dropped. That said, since there may be other use cases where such error values are seen as unwanted noise, allowing an easy way to "quiet" this behavior also seems desirable.
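To make the proposal concrete, here is a minimal Python sketch of the suggested semantics. It is only a model of the idea, not super's implementation: the `ErrorValue` class, the `where` function, and the `quiet` flag are all hypothetical names invented for illustration, and the issue does not specify what the actual "quiet" mechanism would look like.

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterable, Iterator


@dataclass(frozen=True)
class ErrorValue:
    """Hypothetical stand-in for a first-class error value in the stream."""
    payload: Any


def where(values: Iterable[Any],
          predicate: Callable[[Any], bool],
          quiet: bool = False) -> Iterator[Any]:
    """Model of the proposed behavior (assumption, not super's code).

    Error values arriving on the input are passed through unless
    quiet=True; quiet=True models today's behavior of dropping them.
    """
    for value in values:
        if isinstance(value, ErrorValue):
            if not quiet:
                yield value  # surface the upstream error instead of dropping it
            continue
        try:
            keep = predicate(value)
        except (KeyError, TypeError):
            # Assumption: a predicate that fails on a non-error value still
            # drops that value, as the current where docs describe.
            continue
        if keep:
            yield value


upstream = [
    {"foo": -1},
    ErrorValue("query expression produced multiple values (consider [subquery])"),
    {"foo": 1},
]

surfaced = list(where(upstream, lambda v: v["foo"] > 0))
quieted = list(where(upstream, lambda v: v["foo"] > 0, quiet=True))
```

In this model, `surfaced` holds the error value followed by `{"foo": 1}`, while `quieted` holds only `{"foo": 1}`, which corresponds to what the user saw when the error was swallowed.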

Since the search operator has a similar "silence when nothing found" approach, it could probably benefit from the same enhancement.

Inspiration

This is all somewhat similar to grep at the Unix shell, which is also silent by default when nothing matches.

$ echo "foo" | grep "nothing"
[no output]

However, if something upstream surfaces a problem on stderr, that's still visible on a terminal by default.

$ { echo "foo"; echo "ERROR: something went wrong" >&2; } | grep "nothing"
ERROR: something went wrong

Of course, that stderr output can instead be included in the search or discarded entirely if that's the user's preference.

$ { echo "foo"; echo "ERROR: something went wrong" >&2; } 2>&1 | grep "nothing"
[no output]

$ { echo "foo"; echo "ERROR: something went wrong" >&2; } 2>/dev/null | grep "nothing"
[no output]

Related Issues

A couple of other related topics:

  1. As multiple errors accumulate in the pipeline, we've recognized these need to be "stacked", so anything we add here for where and search should be stacked. (Errors: Structured, detailed, & stacked #4608)

  2. As an alternative to commenting out parts of a query, it's been noted that the language does include a debug operator (first added in Debug operator #5196) but it's currently not documented because it's somewhat incomplete (Post-merge operator prevents debug on flowgraph branches #5230).
