Add kRange preceding/following frames in window fuzzer #10006

Closed

Conversation

pramodsatya
Collaborator

@pramodsatya pramodsatya commented Jun 2, 2024

  1. Adds support for kRange preceding/following frames in window fuzzer
  2. Adds reference query runner context for PrestoSql frame clause for kRange
    preceding/following frames.

Resolves #9572.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Jun 2, 2024

netlify bot commented Jun 2, 2024

Deploy Preview for meta-velox canceled.

Name Link
🔨 Latest commit 906d532
🔍 Latest deploy log https://app.netlify.com/sites/meta-velox/deploys/6700320b91067500082ab351

@pramodsatya pramodsatya changed the title [WIP] Add reference query runner context to window fuzzer Add reference query runner context to window fuzzer Jun 11, 2024
@pramodsatya
Collaborator Author

Hi @aditi-pandit, could you please help review these changes? The second commit contains changes specific to query runner context.

Contributor

@kagamiori kagamiori left a comment

Hi @pramodsatya, thank you for putting together this draft! I left some comments. Have you checked whether the current code already works properly with PrestoQueryRunner?

velox/exec/fuzzer/AggregationFuzzerBase.cpp Outdated Show resolved Hide resolved
velox/exec/tests/utils/QueryAssertions.cpp Outdated Show resolved Hide resolved
velox/exec/fuzzer/WindowFuzzer.cpp Outdated Show resolved Hide resolved
velox/exec/fuzzer/WindowFuzzer.cpp Outdated Show resolved Hide resolved
velox/exec/fuzzer/AggregationFuzzerBase.cpp Outdated Show resolved Hide resolved
Comment on lines 128 to 205
if constexpr (std::is_same_v<T, double> || std::is_same_v<T, float>) {
  return offsetCol[idx] - offsetValue;
} else {
  return checkedMinus<T>(offsetCol[idx], offsetValue);
}
Contributor

This piece of code is similar to lines 139-143. Could we reuse the code?
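
One possible shape for the shared helper suggested here is sketched below. It is not taken from the PR: subtractOffset is a hypothetical name, the overflow check is hand-written rather than Velox's checkedMinus, and std::is_floating_point_v is used in place of the explicit double/float test (so it also admits long double).

```cpp
#include <limits>
#include <stdexcept>
#include <type_traits>

// Hypothetical helper (not from the PR): subtracts `offset` from `value`.
// Floating-point types use plain subtraction; integral types throw on
// overflow instead of wrapping.
template <typename T>
T subtractOffset(T value, T offset) {
  if constexpr (std::is_floating_point_v<T>) {
    return value - offset;
  } else {
    static_assert(std::is_integral_v<T>, "integral or floating-point only");
    constexpr T kMin = std::numeric_limits<T>::min();
    constexpr T kMax = std::numeric_limits<T>::max();
    // value - offset < kMin  <=>  value < kMin + offset  (when offset > 0).
    // value - offset > kMax  <=>  value > kMax + offset  (when offset < 0).
    if ((offset > 0 && value < kMin + offset) ||
        (offset < 0 && value > kMax + offset)) {
      throw std::overflow_error("integer overflow in subtraction");
    }
    return value - offset;
  }
}
```

Both call sites could then call subtractOffset(offsetCol[idx], offsetValue) instead of repeating the if constexpr branch.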

velox/exec/fuzzer/WindowFuzzer.cpp Outdated Show resolved Hide resolved
velox/exec/fuzzer/WindowFuzzer.cpp Outdated Show resolved Hide resolved
velox/exec/fuzzer/WindowFuzzer.cpp Outdated Show resolved Hide resolved
velox/exec/fuzzer/WindowFuzzer.cpp Outdated Show resolved Hide resolved
Collaborator Author

@pramodsatya pramodsatya left a comment

Thanks for the feedback @kagamiori. We found some result mismatches with Presto while testing with these changes and are investigating them further.

velox/exec/fuzzer/AggregationFuzzerBase.cpp Outdated Show resolved Hide resolved
velox/exec/fuzzer/AggregationFuzzerBase.cpp Outdated Show resolved Hide resolved
velox/exec/fuzzer/ReferenceQueryRunner.h Outdated Show resolved Hide resolved
velox/exec/fuzzer/WindowFuzzer.cpp Outdated Show resolved Hide resolved
velox/exec/fuzzer/WindowFuzzer.cpp Outdated Show resolved Hide resolved
velox/exec/fuzzer/WindowFuzzer.cpp Outdated Show resolved Hide resolved
velox/exec/fuzzer/WindowFuzzer.cpp Outdated Show resolved Hide resolved
velox/exec/tests/utils/QueryAssertions.cpp Outdated Show resolved Hide resolved
@pramodsatya pramodsatya force-pushed the query_runner_ctx branch 3 times, most recently from 93ddd89 to 09e547b Compare June 28, 2024 03:01
@pramodsatya pramodsatya marked this pull request as ready for review June 28, 2024 03:04
Collaborator

@aditi-pandit aditi-pandit left a comment

Thanks @pramodsatya. This is beginning to look solid.

velox/exec/fuzzer/AggregationFuzzerBase.cpp Outdated Show resolved Hide resolved
velox/exec/fuzzer/AggregationFuzzerBase.cpp Outdated Show resolved Hide resolved
velox/exec/fuzzer/AggregationFuzzerBase.cpp Outdated Show resolved Hide resolved
velox/exec/fuzzer/ReferenceQueryRunner.h Outdated Show resolved Hide resolved
velox/exec/fuzzer/WindowFuzzer.cpp Outdated Show resolved Hide resolved
velox/exec/fuzzer/WindowFuzzer.cpp Outdated Show resolved Hide resolved
velox/exec/fuzzer/WindowFuzzer.cpp Outdated Show resolved Hide resolved
std::is_same_v<T, Timestamp> ||
std::is_same_v<T, UnknownValue>)) {
auto size = vectorFuzzer_.getOptions().vectorSize;
velox::test::VectorMaker vectorMaker{pool_.get()};
Collaborator

Do we really need vectorMaker for the rowVector at the end? It seems unnecessary.

newNames.push_back(columnName);
auto newChildren = input[i]->children();
newChildren.push_back(offsetColumn);
input[i] = vectorMaker.rowVector(newNames, newChildren);
Collaborator

We can use makeRowVector here. Do we really need vectorMaker?

Collaborator Author

makeRowVector is defined in VectorTestBase, which we don't derive from, and vectorMaker is already used in AggregationFuzzerBase::generateInputData. Creating a RowVectorPtr with std::make_shared would require more changes, so vectorMaker looks more convenient here. Could you please confirm whether this is fine?

velox/exec/fuzzer/WindowFuzzer.cpp Outdated Show resolved Hide resolved
@pramodsatya
Collaborator Author

Hi @aditi-pandit, @kagamiori, could you please help review this PR?

Collaborator

@aditi-pandit aditi-pandit left a comment

Thanks @pramodsatya. Only a few minor comments left.

velox/exec/fuzzer/AggregationFuzzerBase.cpp Outdated Show resolved Hide resolved
velox/exec/fuzzer/WindowFuzzer.h Outdated Show resolved Hide resolved
velox/exec/fuzzer/WindowFuzzer.cpp Outdated Show resolved Hide resolved
velox/exec/fuzzer/WindowFuzzer.cpp Outdated Show resolved Hide resolved
@aditi-pandit aditi-pandit changed the title Add reference query runner context to window fuzzer Add kRange preceding/following frames in window fuzzer Jul 23, 2024
@pramodsatya
Collaborator Author

Thanks @pramodsatya. Only a few minor comments left.

Thanks @aditi-pandit, addressed the comments.

@aditi-pandit
Collaborator

@pramodsatya : Please rebase your code. There is a conflict.

Collaborator

@aditi-pandit aditi-pandit left a comment

Thanks @pramodsatya

@pramodsatya
Collaborator Author

pramodsatya commented Sep 10, 2024

Hi @pramodsatya, I saw there is a window fuzzer failure in CI. Could you please take a look? https://github.com/facebookincubator/velox/actions/runs/10730400789/job/29760243003?pr=10006

Hi @kagamiori @aditi-pandit, it looks like the failure was caused by the check for null values in the frameColumn in WindowPartition::updateKRangeFrameBounds. The variable orderByColumn, obtained from sortKeyInfo_, already holds the order-by column index in the reordered data (WindowBuild constructs std::unique_ptr<RowContainer> data_ by placing partitionKeys and orderByKeys first). The orderByRowColumn should therefore be obtained through the inverse map of inputMapping_, so that the null check queries std::vector<exec::RowColumn> columns_ at the right index, that is, the index that maps to sortKeyInfo_.first in the inverse map.

Example input:

-- Window[1][partition by [p0, p1] order by [s0 ASC NULLS LAST] w0 := skewness(ROW["c0"]) RANGE between UNBOUNDED PRECEDING and off1 PRECEDING] -> c0:BIGINT, p0:ROW<f0:REAL,f1:INTERVAL DAY TO SECOND,f2:REAL,f3:DATE,f4:INTERVAL DAY TO SECOND,f5:VARCHAR>, p1:ARRAY<BIGINT>, s0:HUGEINT, row_number:BIGINT, off1:HUGEINT, w0:DOUBLE
  -- Values[0][1000 rows in 10 vectors] -> c0:BIGINT, p0:ROW<f0:REAL,f1:INTERVAL DAY TO SECOND,f2:REAL,f3:DATE,f4:INTERVAL DAY TO SECOND,f5:VARCHAR>, p1:ARRAY<BIGINT>, s0:HUGEINT, row_number:BIGINT, off1:HUGEINT

Issue

column_index_t orderByColumn = sortKeyInfo_[0].first;    ==> 2
inputMapping_ : 3 0 1 2 4 5
inputMapping_[orderByColumn] = 1
RowColumn corresponding to orderByColumn is at index 3 in columns_
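
The index mix-up above can be reproduced with plain vectors. The following is a minimal sketch, not Velox code: invertMapping is a hypothetical helper, column_index_t is a local stand-in alias, and the mapping values are the ones listed in this comment.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in alias for this sketch only.
using column_index_t = uint32_t;

// Hypothetical helper: inverts a permutation.
// If mapping[i] == j, then inverse[j] == i.
std::vector<column_index_t> invertMapping(
    const std::vector<column_index_t>& mapping) {
  std::vector<column_index_t> inverse(mapping.size());
  for (size_t i = 0; i < mapping.size(); ++i) {
    inverse[mapping[i]] = static_cast<column_index_t>(i);
  }
  return inverse;
}
```

With inputMapping_ = {3, 0, 1, 2, 4, 5} and orderByColumn = 2, the direct lookup inputMapping_[2] yields 1, while invertMapping(inputMapping_)[2] yields 3, which matches the index of the order-by RowColumn in columns_ reported above.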

Could you please take another look?

Update: The recent fuzzer failure is unrelated to this change and the same error is seen on other PRs as well.

@kagamiori
Contributor

Hi @pramodsatya, is that failure still reproducible? If so, could you share which command you used to reproduce it? Thanks!

@pramodsatya
Collaborator Author

Hi @pramodsatya, is that failure still reproducible? If so, could you share which command you used to reproduce it? Thanks!

Hi @kagamiori, the failure is no longer reproducible with the current changes. It can be reproduced by reverting the changes to WindowPartition.cpp in this PR and running the following command:

velox_window_fuzzer_test --enable_window_reference_verification --presto_url="http://127.0.0.1:8080" --duration_sec=3600 --logtostderr=1 --minloglevel=0 --seed=2483741532 --req_timeout_ms=5000

Contributor

@kagamiori kagamiori left a comment

Hi @pramodsatya, thank you for investigating the fuzzer failure! I can reproduce that bug, and as you explained, it was due to an incorrect use of inputMapping_ in WindowPartition::updateKRangeFrameBounds(). It's a great demonstration that this fuzzer enhancement enables us to catch a real bug.

Since this PR is already big, I wonder if we could separate the bug fix into another PR and add a unit test with that?

velox/exec/WindowPartition.cpp Outdated Show resolved Hide resolved
@facebook-github-bot
Contributor

@kagamiori has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@pramodsatya
Collaborator Author

Hi @pramodsatya, thank you for investigating the fuzzer failure! I can reproduce that bug, and as you explained, it was due to an incorrect use of inputMapping_ in WindowPartition::updateKRangeFrameBounds(). It's a great demonstration that this fuzzer enhancement enables us to catch a real bug.

Since this PR is already big, I wonder if we could separate the bug fix into another PR and add a unit test with that?

Thanks for the feedback @kagamiori. I addressed the comments and opened another PR for the null check fix: #11075. Could you please take another look?

@facebook-github-bot
Contributor

@kagamiori has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@pramodsatya
Collaborator Author

Hi @kagamiori, after the rebase, the window fuzzer is failing with kRange frames when alternative plans are tested. I'm looking into it:

-- Window[2][STREAMING partition by [p0, p1] order by [s0 DESC NULLS FIRST] w0 := checksum(ROW["c0"]) RANGE between off0 FOLLOWING and off1 FOLLOWING] -> c0:INTERVAL DAY TO SECOND, p0:INTERVAL DAY TO SECOND, p1:BOOLEAN, s0:REAL, row_number:BIGINT, off0:REAL, k1:REAL, off1:REAL, w0:VARBINARY
  -- OrderBy[1][p0 ASC NULLS FIRST, p1 ASC NULLS FIRST, s0 DESC NULLS FIRST] -> c0:INTERVAL DAY TO SECOND, p0:INTERVAL DAY TO SECOND, p1:BOOLEAN, s0:REAL, row_number:BIGINT, off0:REAL, k1:REAL, off1:REAL
    -- Values[0][1000 rows in 10 vectors] -> c0:INTERVAL DAY TO SECOND, p0:INTERVAL DAY TO SECOND, p1:BOOLEAN, s0:REAL, row_number:BIGINT, off0:REAL, k1:REAL, off1:REAL

Expected 1000, got 1000
1 extra rows, 1 missing rows
1 of extra rows:
        2933652544787757131 | 4798976694194643045 | true | "Infinity" | 420 | "Infinity" | "NaN" | "NaN" | "Zw90+g6vltk="

1 of missing rows:
        2933652544787757131 | 4798976694194643045 | true | "Infinity" | 420 | "Infinity" | "NaN" | "NaN" | null

Contributor

@kagamiori kagamiori left a comment

LGTM. Thank you for adding this support, iterating on this PR, and fixing the fuzzer-found bug!

velox/vector/fuzzer/VectorFuzzer.h Outdated Show resolved Hide resolved
@pramodsatya
Collaborator Author

Hi @kagamiori, after the rebase, the window fuzzer is failing with kRange frames when alternative plans are tested. I'm looking into it:

-- Window[2][STREAMING partition by [p0, p1] order by [s0 DESC NULLS FIRST] w0 := checksum(ROW["c0"]) RANGE between off0 FOLLOWING and off1 FOLLOWING] -> c0:INTERVAL DAY TO SECOND, p0:INTERVAL DAY TO SECOND, p1:BOOLEAN, s0:REAL, row_number:BIGINT, off0:REAL, k1:REAL, off1:REAL, w0:VARBINARY
  -- OrderBy[1][p0 ASC NULLS FIRST, p1 ASC NULLS FIRST, s0 DESC NULLS FIRST] -> c0:INTERVAL DAY TO SECOND, p0:INTERVAL DAY TO SECOND, p1:BOOLEAN, s0:REAL, row_number:BIGINT, off0:REAL, k1:REAL, off1:REAL
    -- Values[0][1000 rows in 10 vectors] -> c0:INTERVAL DAY TO SECOND, p0:INTERVAL DAY TO SECOND, p1:BOOLEAN, s0:REAL, row_number:BIGINT, off0:REAL, k1:REAL, off1:REAL

Expected 1000, got 1000
1 extra rows, 1 missing rows
1 of extra rows:
        2933652544787757131 | 4798976694194643045 | true | "Infinity" | 420 | "Infinity" | "NaN" | "NaN" | "Zw90+g6vltk="

1 of missing rows:
        2933652544787757131 | 4798976694194643045 | true | "Infinity" | 420 | "Infinity" | "NaN" | "NaN" | null

Hi @kagamiori, we investigated this error, and it appears to be caused by the recently added changes that enable the vector fuzzer to generate NaN and Infinity values. NaN and Infinity values are now accounted for when constructing the frame offset column, and the error is fixed.
Could you please help review this fix? I will squash to a single commit if the change looks good. Thanks!
cc: @minhancao
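
The thread does not show the actual fix, but the failing row above (s0 = Infinity, off1 = NaN) suggests why such values need special handling: under IEEE 754 arithmetic, adding or subtracting a finite offset leaves Infinity unchanged, any arithmetic on NaN yields NaN, and Infinity minus Infinity is NaN. The sketch below is only an illustration under that assumption; isUsableFrameBound is a hypothetical name, not a function from the PR.

```cpp
#include <cmath>
#include <limits>

// Hypothetical check (not from the PR): returns true only when
// `orderByValue - offset` is an ordinary finite number. NaN or infinite
// inputs propagate through the subtraction and are rejected.
inline bool isUsableFrameBound(float orderByValue, float offset) {
  return std::isfinite(orderByValue - offset);
}
```

A fuzzer could use a check of this kind to decide when an offset column value needs separate treatment instead of being computed by plain arithmetic.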

Contributor

@kagamiori kagamiori left a comment

Hi @pramodsatya, I left one comment. The other part of this fix looks good to me. Thanks!

velox/exec/fuzzer/WindowFuzzer.cpp Outdated Show resolved Hide resolved
@pramodsatya
Collaborator Author

Hi @pramodsatya, I left one comment. The other part of this fix looks good to me. Thanks!

Thanks for the suggestion @kagamiori, updated accordingly. Could you please help merge this change?

@facebook-github-bot
Contributor

@kagamiori has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@kagamiori
Contributor

kagamiori commented Oct 8, 2024

Hi @pramodsatya, FYI, there was an internal linter complaint about defining a static variable in the header file VectorFuzzer.h. So I reverted the change in VectorFuzzer.h and VectorFuzzer.cpp and instead declared the function const std::vector<TypePtr>& defaultScalarTypes() in VectorFuzzer.h so that AggregationFuzzerBase.cpp can use defaultScalarTypes(). I have checked that there are no build issues on my Mac and Linux.

I'm going to start the merging process now.
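
For readers unfamiliar with the pattern, a header can expose a shared list without defining a static variable by declaring an accessor and defining it in a source file. The sketch below shows one common way to do that, with a function-local static. It is an assumption about the general shape, not the merged code: std::string stands in for TypePtr, and the type names in the list are placeholders.

```cpp
#include <string>
#include <vector>

// --- Header part: declaration only, no static variable defined here. ---
// std::string stands in for TypePtr to keep the sketch self-contained.
const std::vector<std::string>& defaultScalarTypes();

// --- Source part: definition using a function-local static. ---
const std::vector<std::string>& defaultScalarTypes() {
  // Initialized once, on first call; initialization is thread-safe
  // since C++11. The entries are placeholders for illustration.
  static const std::vector<std::string> kTypes{
      "BOOLEAN", "INTEGER", "BIGINT", "REAL", "DOUBLE", "VARCHAR"};
  return kTypes;
}
```

Every translation unit that includes the header then shares one list, rather than each one getting its own copy of a header-defined static.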

@pramodsatya
Collaborator Author

Hi @pramodsatya, FYI, there was an internal linter complaint about defining a static variable in the header file VectorFuzzer.h. So I reverted the change in VectorFuzzer.h and VectorFuzzer.cpp and instead declared the function const std::vector<TypePtr>& defaultScalarTypes() in VectorFuzzer.h so that AggregationFuzzerBase.cpp can use defaultScalarTypes(). I have checked that there are no build issues on my Mac and Linux.

I'm going to start the merging process now.

Thanks for the update @kagamiori, sounds good. Please let me know if any other changes are needed.

@facebook-github-bot
Contributor

@kagamiori merged this pull request in ce035c8.


Conbench analyzed the 1 benchmark run on commit ce035c87.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details.

@Yuhta
Contributor

Yuhta commented Oct 10, 2024

@pramodsatya Window fuzzer has been broken since this change, can you take a look?
https://github.com/facebookincubator/velox/actions/runs/11226206459/job/31207444517

@kagamiori
Contributor

@pramodsatya Window fuzzer has been broken since this change, can you take a look? https://github.com/facebookincubator/velox/actions/runs/11226206459/job/31207444517

@Yuhta, I'm taking a look at this failure: #11213.

Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. Merged
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Enhance WindowFuzzer to support k-range and k-rows frames with column boundary
5 participants