-
Notifications
You must be signed in to change notification settings - Fork 525
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Upgrade to google.golang.org/grpc v1.66.2 / modify certain protobuf messages to retain their unmarshaling buffer #9401
base: main
Are you sure you want to change the base?
Conversation
06c9166
to
c6b77ea
Compare
Some tests are breaking for this upgrade, have to fix them. |
7ed1789
to
b61cbbd
Compare
Edit: Memory unsafe optimizations are back again :) Solved by letting relevant protobuf messages keep an unmarshalling buffer reference. |
d79ea9f
to
f484e3a
Compare
833e5e6
to
a38a9e5
Compare
f574a11
to
3e5734b
Compare
3e5734b
to
f620574
Compare
Signed-off-by: Arve Knudsen <[email protected]>
f620574
to
fcd7178
Compare
pkg/mimirpb/split_test.go
Outdated
@@ -166,7 +166,7 @@ func TestSplitWriteRequestByMaxMarshalSize_WriteRequestHasChanged(t *testing.T) | |||
|
|||
// If the fields of WriteRequest have changed, then you will probably need to modify | |||
// the SplitWriteRequestByMaxMarshalSize() implementation accordingly! | |||
assert.ElementsMatch(t, []string{"Timeseries", "Source", "Metadata", "SkipLabelValidation", "skipUnmarshalingExemplars"}, fieldNames) | |||
assert.ElementsMatch(t, []string{"Timeseries", "Source", "Metadata", "SkipLabelValidation", "skipUnmarshalingExemplars", "buffer"}, fieldNames) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the most correct thing to do in SplitWriteRequestByMaxMarshalSize()
is to retain a reference of the original buffer
in each split WriteRequest
(and by calling bufferRef()
for each split WriteRequest
). We could also mention it in the SplitWriteRequestByMaxMarshalSize()
function doc.
Then where we call SplitWriteRequestByMaxMarshalSize()
(which is marshalWriteRequestToRecords()
) we should call the FreeBuffer()
on each split request once we've done marshalling it.
The only annoying thing of this approach is that SplitWriteRequestByMaxMarshalSize()
will just return the input WriteRequest
is not splitting happened at all. We could slightly modify it to also return a bool
to indicate if splitting happened, so that the caller will call FreeBuffer()
only if splitting happened.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I agree on this. These are internal things we do with the WriteRequest, they shouldn't affect the outer lifecycle: we had a WriteRequest, we call FreeBuffer after using it (and using it here implies splitting it into multiple ones).
Maybe we should clarify in the comment of SplitWriteRequestByMaxMarshalSize
that the requests returned in the slice might still retain references to the original WriteRequest.
What doesn't convince me is that here for some reason we would do a special thing: we sould retain more references to that buffer and free them separately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with you Oleg, and suggested adding the comment instead on internal slack thread as well.
pkg/mimirpb/custom.go
Outdated
// CodecV2 customizes gRPC unmarshalling. | ||
type CodecV2 struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would rename this codec to reflect it's purpose, something like: BufferHolderCodec
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the name CodecV2
makes complete sense, since it wraps the codecV2
type in grpc/encoding/proto
and it implements the grpc/encoding.CodecV2
interface. My reasoning is that if the naming makes sense for those, the same holds for our wrapper type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO, our codec, that does special things, should reflect in it's name what's the special thing it's doing. Their codecV2
name is because it's the default implementation.
BTW, do we need ours to be exported? Edit: I see you have unexported it.
To be clear: this is opinionated, not a blocking comment.
|
||
// CodecV2 customizes gRPC unmarshalling. | ||
type CodecV2 struct { | ||
encoding.CodecV2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would not embed this here, but convert it to a field and implement the two methods missing.
Otherwise it's unclear what are the other things that we're inheriting from the orignal codec: for example here this means that our codec has the same Name()
as the parent: is that ok?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's definitely OK that Name()
is the same as encoding.CodecV2
, as it should replace it:
func init() {
c := encoding.GetCodecV2(proto.Name)
encoding.RegisterCodecV2(&CodecV2{c})
}
I think that embedding is good, because our implementation is supposed to only customize the Unmarshal
method of the standard CodecV2
.
If this case isn't right for struct embedding, then I wonder which is?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that embedding is good, because our implementation is supposed to only customize the Unmarshal method of the standard CodecV2.
We can still "forward" implementation of other methods to original implementation, if we use named field. It would just be more explicit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes of course, it's possible to do the same without embedding. Help me understand though, why should we not use the embedding technique in this case? Are you against it in principle? If embedding is not the right technique in this case, why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this case isn't right for struct embedding, then I wonder which is?
I think struct embedding is valid in codebases that respect backwards incompatible changes. Given that the grpc package makes breaking changes in minor versions, it would be very useful, IMO, to detect that they suddenly added a new method to the CodecV2 interface that changes the behaviour in some drastic way.
We can still "forward" implementation of other methods to original implementation, if we use named field. It would just be more explicit.
Yes, that's what I meant.
type CodecV2 struct {
wrapped encoding.CodecV2
}
func (c *CodecV2) Name() string { return c.wrapped.Name() }
Etc.
Edit: to be clear, it's my opinion, not a blocking comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that we should not use embedding. (Personally I think it's a good practice to avoid embedding by default, since it hides incompatible changes, and it often exposes more methods to "outer" type than expected or necessary. Typical misuse of embedding is embedding sync.Lock
. This second mentioned issue is not the case in this specific instance.)
Co-authored-by: Oleg Zaytsev <[email protected]>
Signed-off-by: Arve Knudsen <[email protected]>
Signed-off-by: Arve Knudsen <[email protected]>
Signed-off-by: Arve Knudsen <[email protected]>
Signed-off-by: Arve Knudsen <[email protected]>
Signed-off-by: Arve Knudsen <[email protected]>
Signed-off-by: Arve Knudsen <[email protected]>
What this PR does
Upgrade google.golang.org/grpc from v1.65.0 to v1.66.2. The performance regression referenced from dskit's reversal to v1.65.0 is fixed in v1.66.2, and the potential functional regression also referenced in said reversal is Loki specific.
What we found out is that v1.66 fixes a serious performance bottleneck in conjunction with compression; the
decompress
function is optimized through memory pooling, whereas v1.65.0 would useio.ReadAll
and cause a lot of allocations during decompression. As a result, we saw ingester CPU usage increase by ~40% when enabling gRPC compression. The increase is now much more modest, maybe ~7% based on observations.In order to keep allowing for unsafe references to unmarshalling buffers in e.g.
LabelAdapter
, I had to introduce a custom gRPC unmarshalling hook (mimirpb.CodecV2.Unmarshal
) plus a scheme for protobuf messages to keep a reference to their unmarshalling buffer (mimirpb.UnmarshalerV2
). The underlying reason is that from v1.66 on, gRPC immediately recycles each buffer after unmarshalling, unless a reference is kept. This causes data races with e.g.LabelAdapter
taking unsafe string references to the unmarshal buffer, unless, as mentioned, a buffer reference is kept with the root protobuf message.Ideally, one should call
FreeBuffer
on protobuf messages keeping an unmarshalling buffer reference so the buffer can be given back to the gRPC pool. If this isn't done though, there also shouldn't be a memory leak since the buffer should be garbage collected.Which issue(s) this PR fixes or relates to
Checklist
CHANGELOG.md
updated - the order of entries should be[CHANGE]
,[FEATURE]
,[ENHANCEMENT]
,[BUGFIX]
.about-versioning.md
updated with experimental features.