KAFKA-17743: Add minBytes implementation to DelayedShareFetch #17539

Open

adixitconfluent wants to merge 6 commits into apache:trunk from adixitconfluent:kafka-17743

+151 −58

Contributor

adixitconfluent commented Oct 18, 2024

About

minBytes is a constraint that should be used to delay ShareFetch requests, so this PR adds minBytes support to the DelayedShareFetch class.

Testing

The added code has been tested with the help of unit tests.

adixitconfluent added 3 commits

October 9, 2024 14:47


          Branch creation

dda4c86


          Added a utility function to get replica manager fetch data in case its more than minBytes

e1880b9


          Merge remote-tracking branch 'origin/trunk' into kafka-17743

604ffbd

github-actions bot added core KIP-932 small labels


          Added support for minBytes in DelayedShareFetch and fixed tests

f777e94

github-actions bot removed the small label


          Added more unit test

1ec8987

adixitconfluent marked this pull request as ready for review

October 18, 2024 12:18

omkreddy added the ci-approved label


          Empty commit to trigger build

afb8596

apoorvmittal10 requested review from apoorvmittal10, AndrewJSchofield and junrao

October 18, 2024 12:55

apoorvmittal10 reviewed


core/src/main/java/kafka/server/share/DelayedShareFetch.java

    
    LogReadResult logResult = tpLogResult._2();
    FetchPartitionData fetchPartitionData = logResult.toFetchPartitionData(false);
    responseData.put(topicIdPartition, fetchPartitionData);
    accumulatedBytes.addAndGet(logResult.info().records.sizeInBytes());

Collaborator

apoorvmittal10 Oct 18, 2024

Question for my understanding: do we need this calculation? As far as I can see, the fetch params passed to the replica manager already contain minBytes, so shouldn't the response from the replica manager be empty if the minBytes criterion is not satisfied?

So the question arises: how do we differentiate between an empty response from the replica manager log read that is because of minBytes and one that is because there is no data in the log? In either case, should we continue holding the request in purgatory? Wdyt?

Contributor Author

adixitconfluent Oct 18, 2024

Hi @apoorvmittal10, IIUC, minBytes is used in the replicaManager.fetchMessages functionality (here), not in replicaManager.readFromLog. My code calculates accumulatedBytes the same way fetchMessages does (original code reference). I don't see any usage of params.minBytes in readFromLog.

Collaborator

apoorvmittal10 Oct 18, 2024

I think you are right. I also see the minBytes param referenced only in fetchMessages and not in readFromLog. Also, the description of readFromLog says "up to maximum" and nothing about minBytes.

Then the PR change sounds good, but I was wondering why we accept the complete fetchParams in readFromLog when we don't use something like minBytes there. Not sure if we should have minBytes support in readFromLog itself. Maybe that is out of scope for this PR.

@junrao can help us with more context.
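
For readers following this thread, the byte accounting under discussion can be sketched roughly as below. This is a minimal illustration, not the PR's code: `PartitionRead` and `MinBytesSketch` are hypothetical stand-ins (the real code works with LogReadResult and FetchPartitionData), and partitions are keyed by plain strings for brevity.

```java
import java.util.Map;

// Hypothetical stand-in for one partition's log read result.
// The real code reads the size via logResult.info().records.sizeInBytes().
final class PartitionRead {
    final int sizeInBytes;

    PartitionRead(int sizeInBytes) {
        this.sizeInBytes = sizeInBytes;
    }
}

// Hypothetical helper showing the accumulate-then-compare idea.
final class MinBytesSketch {
    // Sums the bytes read across all partitions of a single fetch.
    static long accumulatedBytes(Map<String, PartitionRead> reads) {
        long total = 0;
        for (PartitionRead read : reads.values()) {
            total += read.sizeInBytes;
        }
        return total;
    }

    // The fetch meets the constraint once the total reaches minBytes.
    static boolean minBytesSatisfied(Map<String, PartitionRead> reads, int minBytes) {
        return accumulatedBytes(reads) >= minBytes;
    }
}
```

Under these assumptions, the check happens in the caller after the log read, which matches the observation above that the log read itself does not apply minBytes.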

adixitconfluent requested a review from apoorvmittal10

October 18, 2024 14:48

apoorvmittal10 reviewed


core/src/main/java/kafka/server/share/DelayedShareFetch.java

Comment on lines +234 to +237

    
    // This call is coming from onComplete, hence we return the response data irrespective of whether minBytes is
    // satisfied or not.
    if (hasRequestTimedOut)
        return responseData;

Collaborator

apoorvmittal10 Oct 18, 2024

Hmmm, is this the same for regular fetch operations as well?

Contributor Author

adixitconfluent Oct 18, 2024

Now that I think about it again, I should return a map with the topic partition as key and a fetchPartitionData object containing 0 records as value, since we have not been able to satisfy all the fetch request criteria. @junrao, your thoughts?
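
One way the proposal in this comment could look is sketched below. The thread does not settle whether this is the right behavior, so treat it purely as an illustration: `EmptyResponseSketch` is a hypothetical name, partitions are plain strings, and an empty list stands in for a fetchPartitionData object with 0 records.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: builds a response that names every requested
// partition but carries no records for any of them.
final class EmptyResponseSketch {
    static Map<String, List<byte[]>> emptyResponseFor(Set<String> partitions) {
        Map<String, List<byte[]>> response = new HashMap<>();
        for (String partition : partitions) {
            // An empty list stands in for "0 records" here.
            response.put(partition, List.of());
        }
        return response;
    }
}
```

The point of keeping the keys is that the caller still knows which partitions were involved in the request, even though no data is returned for them.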

core/src/main/java/kafka/server/share/DelayedShareFetch.java

Comment on lines +229 to +232

    
    if (replicaManagerFetchSatisfyingMinBytes.isEmpty() && !hasRequestTimedOut) {
        // Releasing the lock to move ahead with the next request in queue.
        releasePartitionLocks(shareFetchData.groupId(), topicPartitionData.keySet());
    }

Collaborator

apoorvmittal10 Oct 18, 2024

And why don't we want to release the partition locks from onComplete?

Contributor Author

adixitconfluent Oct 18, 2024

In this case, we want to call ShareFetchUtils.processFetchResponse(shareFetchData, fetchResponseData, sharePartitionManager, replicaManager) before we release the locks. That call is in onComplete, hence we don't release the locks here.
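
The ordering described in this reply (process the response first, release the locks afterwards) is commonly expressed with try/finally. The sketch below assumes nothing about the real classes: `CompleteOrderSketch` and its parameters are hypothetical, and the two callbacks stand in for the processFetchResponse call and the lock release.

```java
import java.util.function.Supplier;

// Hypothetical helper: runs the processing step, then always releases
// the locks, even if processing throws.
final class CompleteOrderSketch {
    static <T> T processThenRelease(Supplier<T> process, Runnable releaseLocks) {
        try {
            // Processing needs the partition locks to still be held.
            return process.get();
        } finally {
            // Runs after processing, on both the success and failure paths.
            releaseLocks.run();
        }
    }
}
```

With this shape, a code path that hands its data to the completing step must not release the locks itself, otherwise processing would run without them.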

core/src/main/java/kafka/server/share/DelayedShareFetch.java

    
                                                              boolean hasRequestTimedOut) {
    log.trace("Fetchable share partitions data: {} with groupId: {} fetch params: {}", topicPartitionData,
        shareFetchData.groupId(), shareFetchData.fetchParams());
    Map<TopicIdPartition, FetchPartitionData> replicaManagerFetchSatisfyingMinBytes = new HashMap<>();

Collaborator

apoorvmittal10 Oct 18, 2024

In most scenarios the request might have minBytes, so do you always want to initialize a hash map? Mostly it will be overridden with the responseData map. So can't it be null? Moreover, can't it simply be a boolean variable, i.e.

boolean minBytesSatisfied = false

if (accumulatedBytes.get() >= shareFetchData.fetchParams().minBytes)
replicaManagerFetchSatisfyingMinBytes = responseData;
=>
if (accumulatedBytes.get() >= shareFetchData.fetchParams().minBytes)
minBytesSatisfied = true;

if (replicaManagerFetchSatisfyingMinBytes.isEmpty() && !hasRequestTimedOut) {
=>
if (!minBytesSatisfied && !hasRequestTimedOut) {

return replicaManagerFetchSatisfyingMinBytes;
=>
return Collections.emptyMap()

Contributor Author

adixitconfluent Oct 18, 2024

yeah, it makes sense. I'll make the change.
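
Put together, the boolean-flag version suggested above might look roughly like the sketch below. It is an illustration under assumptions, not the change that was eventually made: `FetchDecisionSketch.select` is a hypothetical method, the accumulated byte count is passed in directly, and the lock release is modeled as a callback.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical helper mirroring the suggested control flow.
final class FetchDecisionSketch {
    static <K, V> Map<K, V> select(Map<K, V> responseData,
                                   long accumulatedBytes,
                                   int minBytes,
                                   boolean hasRequestTimedOut,
                                   Runnable releaseLocks) {
        boolean minBytesSatisfied = accumulatedBytes >= minBytes;
        if (minBytesSatisfied || hasRequestTimedOut) {
            // The caller completes the request with this data and
            // releases the locks afterwards.
            return responseData;
        }
        // Not enough data and not timed out: free the partitions so the
        // next request in the queue can proceed.
        releaseLocks.run();
        return Collections.emptyMap();
    }
}
```

The flag avoids allocating a second map just to signal "not satisfied", which is the concern raised in the review comment.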

core/src/main/java/kafka/server/share/DelayedShareFetch.java

    
        replicaManagerFetchDataFromTryComplete = replicaManagerFetchData(topicPartitionData, false);
        if (!replicaManagerFetchDataFromTryComplete.isEmpty())
            return forceComplete();
    }
    log.info("Can't acquire records for any partition in the share fetch request for group {}, member {}, " +

Collaborator

apoorvmittal10 Oct 18, 2024

Now we can also reach this code path when replicaManagerFetchData returns no result. Is the log line still correct?

Contributor Author

adixitconfluent Oct 18, 2024

You're right... I'll change it to log.info("Fetch cannot be completed for the partitions in the share fetch request for group {}, member {}, " + "topic partitions {}", shareFetchData.groupId(), shareFetchData.memberId(), shareFetchData.partitionMaxBytes().keySet())

core/src/main/java/kafka/server/share/DelayedShareFetch.java

    
        // No locks for share partitions could be acquired, so we complete the request with an empty response.
        shareFetchData.future().complete(Collections.emptyMap());
        return;
    Map<TopicIdPartition, FetchPartitionData> fetchResponseData;

Collaborator

apoorvmittal10 Oct 18, 2024

Do you need this extra variable, or can you just assign it later if needed? Then there is no need for the else block below.

replicaManagerFetchDataFromTryComplete = replicaManagerFetchData(topicPartitionData, true);

core/src/main/java/kafka/server/share/DelayedShareFetch.java

    
      Map<TopicIdPartition, ShareFetchResponseData.PartitionData> result =
-         ShareFetchUtils.processFetchResponse(shareFetchData, responseData, sharePartitionManager, replicaManager);
+         ShareFetchUtils.processFetchResponse(shareFetchData, fetchResponseData, sharePartitionManager, replicaManager);

Collaborator

apoorvmittal10 Oct 18, 2024

fetchResponseData can still be empty here. processFetchResponse does handle the empty case, but is that intended? No harm either way, just checking with you.

Contributor Author

adixitconfluent Oct 18, 2024

yeah, since processFetchResponse can handle it, that's why I didn't add any check here

core/src/main/java/kafka/server/share/DelayedShareFetch.java

    
      shareFetchData.future().complete(result);
  } catch (Exception e) {
      log.error("Error processing delayed share fetch request", e);
      shareFetchData.future().completeExceptionally(e);
  } finally {
      // Releasing the lock to move ahead with the next request in queue.
-     releasePartitionLocks(shareFetchData.groupId(), topicPartitionData.keySet());
+     releasePartitionLocks(shareFetchData.groupId(), fetchResponseData.keySet());

Collaborator

apoorvmittal10 Oct 18, 2024

What if partitions were locked but no data arrived in the response? Will the locks still be released correctly?

Contributor Author

adixitconfluent Oct 18, 2024

For that case, even when no data is received from the replica manager, fetchResponseData should still have the locked topic partitions as keys and empty data as values, so it should work. Am I wrong in that understanding?
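
The concern in this exchange comes down to whether the key set used for the release covers every partition that was locked. The sketch below makes that condition checkable. It is a hypothetical helper (`LockReleaseSketch.leftLocked`, string partition names) and makes no claim about what the replica manager actually returns.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: computes which acquired partitions would stay
// locked if the release is driven only by the response's key set.
final class LockReleaseSketch {
    static Set<String> leftLocked(Set<String> acquired, Map<String, ?> fetchResponseData) {
        Set<String> remaining = new HashSet<>(acquired);
        // Only partitions present as keys in the response get released.
        remaining.removeAll(fetchResponseData.keySet());
        return remaining;
    }
}
```

If the response always contains an entry (possibly empty) for each locked partition, as the author expects, the result is an empty set and both key sets are equivalent for the release. If any locked partition can be missing from the response, it would remain locked.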


Labels

ci-approved core KIP-932