release prep for v1.4.10 #91
Merged
ppetter1025 merged 21 commits on May 9, 2026
ppetter1025 (Collaborator) commented on May 7, 2026
- treewide: Replace kmalloc with kmalloc_obj for non-scalar types
- Convert 'alloc_obj' family to use the new default GFP_KERNEL argument
- Convert more 'alloc_obj' cases to default GFP_KERNEL arguments
- Convert remaining multi-line kmalloc_obj/flex GFP_KERNEL uses
- gve: fix incorrect buffer cleanup in gve_tx_clean_pending_packets for QPL
- gve: Update QPL page registration logic
- gve: Enable reading max ring size from the device in DQO-QPL mode
- gve: Advertise NETIF_F_GRO_HW instead of NETIF_F_LRO
- gve: fix SW coalescing when hw-GRO is used
- gve: pull network headers into skb linear part
- gve: Enable hw-gro by default if device supported
- gve: add support for UDP GSO for DQO format
- cocci: Update kernel version of ipv6_hopopt_jumbo_remove
- cocci: Apply netdev_lock backports to RHEL 10.2 and above
- cocci: Apply XDP locking changes to RHEL10.2 and above
- Add upstream_test command to build.sh and fix kokoro build error
- cocci: Add patch for kmalloc_objs family
- build_src: Run spatch tool in parallel per-file
- Bump driver version to v1.5.0
This is the result of running the Coccinelle script from
scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to
avoid scalar types (which need careful case-by-case checking), and
instead replace kmalloc-family calls that allocate struct or union
object instances:
Single allocations: kmalloc(sizeof(TYPE), ...)
are replaced with: kmalloc_obj(TYPE, ...)
Array allocations: kmalloc_array(COUNT, sizeof(TYPE), ...)
are replaced with: kmalloc_objs(TYPE, COUNT, ...)
Flex array allocations: kmalloc(struct_size(PTR, FAM, COUNT), ...)
are replaced with: kmalloc_flex(*PTR, FAM, COUNT, ...)
(where TYPE may also be *VAR)
The resulting allocations no longer return "void *", instead returning
"TYPE *".
Signed-off-by: Kees Cook <kees@kernel.org>
net-next: 69050f8d6d075dc01af7a5f2f550a8067510366f
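The three replacement forms can be illustrated with a minimal userspace sketch. The demo_* macros and structs below are hypothetical stand-ins built on calloc(), not the kernel's definitions; they only show how a type-aware allocation macro can return "TYPE *" instead of "void *" and accept either a type or an expression such as *VAR. __typeof__ is a GCC/Clang extension.

```c
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical userspace stand-ins (assumed names, not the kernel macros).
 * T may be a type name or an expression such as *ptr.
 */
#define demo_alloc_obj(T)     ((__typeof__(T) *)calloc(1, sizeof(T)))
#define demo_alloc_objs(T, n) ((__typeof__(T) *)calloc((n), sizeof(T)))
/* Header plus n trailing elements of the flexible array member 'fam'.
 * Unlike the kernel's struct_size(), this sketch does not check overflow. */
#define demo_alloc_flex(T, fam, n)                                \
	((__typeof__(T) *)calloc(1, offsetof(__typeof__(T), fam) +    \
		(size_t)(n) * sizeof(((__typeof__(T) *)0)->fam[0])))

/* Illustrative types only. */
struct demo_pkt {
	int len;
};

struct demo_ring {
	unsigned int count;
	unsigned int slots[]; /* flexible array member */
};
```

Because each macro yields a typed pointer, assigning the result to a pointer of a different struct type draws a compiler diagnostic, which a plain "void *" return would not.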
This was done entirely with mindless brute force, using
git grep -l '\<k[vmz]*alloc_objs*(.*, GFP_KERNEL)' |
xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/'
to convert the new alloc_obj() users that had a simple GFP_KERNEL
argument to just drop that argument.
Note that due to the extreme simplicity of the scripting, any slightly
more complex cases spread over multiple lines would not be triggered:
they definitely exist, but this covers the vast bulk of the cases, and
the resulting diff is also then easier to check automatically.
For the same reason the 'flex' versions will be done as a separate
conversion.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
net-next: bf4afc53b77aeaa48b5409da5c8da6bb4eff7f43
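Dropping the trailing GFP_KERNEL only works if the allocation macros supply a default when the flags argument is omitted. The following is a minimal sketch of one way a variadic macro can do that. Every name here (demo_alloc_obj, DEMO_FLAG_*, demo_alloc_with_flags) is hypothetical, and the kernel's actual macros may be written differently. __VA_OPT__ is standard in C23 and accepted by recent GCC and Clang in their default modes.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical flag values standing in for GFP_KERNEL / GFP_ATOMIC. */
enum { DEMO_FLAG_DEFAULT = 1, DEMO_FLAG_ATOMIC = 2 };

/* Pick the caller's flags if given, otherwise fall back to the default. */
#define demo_pick_first(a, ...) a
#define demo_default_flags(...) \
	demo_pick_first(__VA_ARGS__ __VA_OPT__(,) DEMO_FLAG_DEFAULT)

/* Records the flags it was called with so the effect is observable. */
static int demo_last_flags;

static void *demo_alloc_with_flags(size_t size, int flags)
{
	demo_last_flags = flags;
	return calloc(1, size);
}

/* The flags argument is optional: demo_alloc_obj(T) or demo_alloc_obj(T, f). */
#define demo_alloc_obj(T, ...) \
	((T *)demo_alloc_with_flags(sizeof(T), demo_default_flags(__VA_ARGS__)))

struct demo_item {
	int value;
};
```

With this shape, demo_alloc_obj(struct demo_item) and demo_alloc_obj(struct demo_item, DEMO_FLAG_DEFAULT) behave identically, which is why a textual removal of the default argument is safe.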
This converts some of the visually simpler cases that have been split
over multiple lines. I only did the ones that are easy to verify the
resulting diff by having just that final GFP_KERNEL argument on the next
line.
Somebody should probably do a proper coccinelle script for this, but for
me the trivial script actually resulted in an assertion failure in the
middle of the script. I probably had made it a bit _too_ trivial.
So after fighting that for a while I decided to just do some of the
syntactically simpler cases with variations of the previous 'sed'
scripts.
The more syntactically complex multi-line cases would mostly really want
whitespace cleanup anyway.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
net-next: 32a92f8c89326985e05dce8b22d3f0aa07a3e1bd
Conversion performed via this Coccinelle script:
// SPDX-License-Identifier: GPL-2.0-only
// Options: --include-headers-for-types --all-includes --include-headers --keep-comments
virtual patch
@gfp depends on patch && !(file in "tools") && !(file in "samples")@
identifier ALLOC = {kmalloc_obj,kmalloc_objs,kmalloc_flex,
                    kzalloc_obj,kzalloc_objs,kzalloc_flex,
                    kvmalloc_obj,kvmalloc_objs,kvmalloc_flex,
                    kvzalloc_obj,kvzalloc_objs,kvzalloc_flex};
@@
  ALLOC(...
-   , GFP_KERNEL
  )
$ make coccicheck MODE=patch COCCI=gfp.cocci
Build and boot tested x86_64 with Fedora 42's GCC and Clang:
Linux version 6.19.0+ (user@host) (gcc (GCC) 15.2.1 20260123 (Red Hat 15.2.1-7), GNU ld version 2.44-12.fc42) #1 SMP PREEMPT_DYNAMIC 1970-01-01
Linux version 6.19.0+ (user@host) (clang version 20.1.8 (Fedora 20.1.8-4.fc42), LLD 20.1.8) #1 SMP PREEMPT_DYNAMIC 1970-01-01
Signed-off-by: Kees Cook <kees@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
net-next: 189f164e573e18d9f8876dbd3ad8fcbe11f93037
… QPL
In DQO-QPL mode, gve_tx_clean_pending_packets() incorrectly uses the RDA
buffer cleanup path. It iterates num_bufs times and attempts to unmap
entries in the dma array.
This leads to two issues:
1. The dma array shares storage with tx_qpl_buf_ids (union).
Interpreting buffer IDs as DMA addresses results in attempting to
unmap incorrect memory locations.
2. num_bufs in QPL mode (counting 2K chunks) can significantly exceed
the size of the dma array, causing out-of-bounds access warnings
(trace below is how we noticed this issue).
UBSAN: array-index-out-of-bounds in
drivers/net/ethernet/google/gve/gve_tx_dqo.c:178:5 index 18 is out of
range for type 'dma_addr_t[18]' (aka 'unsigned long long[18]')
Workqueue: gve gve_service_task [gve]
Call Trace:
<TASK>
dump_stack_lvl+0x33/0xa0
__ubsan_handle_out_of_bounds+0xdc/0x110
gve_tx_stop_ring_dqo+0x182/0x200 [gve]
gve_close+0x1be/0x450 [gve]
gve_reset+0x99/0x120 [gve]
gve_service_task+0x61/0x100 [gve]
process_scheduled_works+0x1e9/0x380
Fix this by properly checking for QPL mode and delegating to
gve_free_tx_qpl_bufs() to reclaim the buffers.
Cc: stable@vger.kernel.org
Fixes: a6fb8d5a8b69 ("gve: Tx path for DQO-QPL")
Signed-off-by: Ankit Garg <nktgrg@google.com>
Reviewed-by: Jordan Rhee <jordanrhee@google.com>
Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com>
Signed-off-by: Joshua Washington <joshwash@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260220215324.1631350-1-joshwash@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
net-next: fb868db5f4bccd7a78219313ab2917429f715cea
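The two issues described above can be illustrated with a small userspace model. This is a sketch under assumptions, not the driver code: the struct layout, the demo_* names, and the QPL array size of 32 are hypothetical (only the DMA array size of 18 comes from the UBSAN trace), and the cleanup function only counts what it would release instead of unmapping or freeing anything.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_MAX_DMA_BUFS 18 /* matches 'dma_addr_t[18]' in the trace */
#define DEMO_MAX_QPL_BUFS 32 /* hypothetical size for this sketch */

struct demo_pending_packet {
	union {
		/* RDA mode: DMA addresses to unmap. */
		uint64_t dma[DEMO_MAX_DMA_BUFS];
		/* QPL mode: buffer IDs sharing the same storage. */
		int16_t qpl_buf_ids[DEMO_MAX_QPL_BUFS];
	};
	uint16_t num_bufs;
};

struct demo_cleanup_stats {
	int unmapped;  /* entries treated as DMA addresses */
	int qpl_freed; /* entries treated as QPL buffer IDs */
};

/* Choose the cleanup path from the queue mode, never from the data. */
static void demo_clean_pending_packet(struct demo_pending_packet *pkt,
				      bool qpl_mode,
				      struct demo_cleanup_stats *st)
{
	uint16_t i;

	if (qpl_mode) {
		/* Stand-in for returning buffer IDs to the QPL free list. */
		for (i = 0; i < pkt->num_bufs && i < DEMO_MAX_QPL_BUFS; i++)
			st->qpl_freed++;
	} else {
		/* Stand-in for unmapping each DMA address; bounded by the array. */
		for (i = 0; i < pkt->num_bufs && i < DEMO_MAX_DMA_BUFS; i++)
			st->unmapped++;
	}
	pkt->num_bufs = 0;
}
```

In this model a QPL packet with 25 buffers never touches dma[], so neither the misinterpretation of IDs as addresses nor the index-18 overrun can occur.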
For DQO, change QPL page registration logic to be more flexible to honor
the "max_registered_pages" parameter from the gVNIC device.
Previously the number of RX pages per QPL was hardcoded to twice the
ring size, and the number of TX pages per QPL was dictated by the device
in the DQO-QPL device option. Now [in DQO-QPL mode], the driver will
ignore the "tx_pages_per_qpl" parameter indicated in the DQO-QPL device
option and instead allocate up to (tx_queue_length / 2) pages per TX QPL
and up to (rx_queue_length * 2) pages per RX QPL while keeping the total
number of pages under the "max_registered_pages".
Merge DQO and GQI QPL page calculation logic into a unified
gve_update_num_qpl_pages function. Add rx_pages_per_qpl to the priv
struct for consumption by both DQO and GQI.
Signed-off-by: Matt Olson <maolson@google.com>
Signed-off-by: Max Yuan <maxyuan@google.com>
Reviewed-by: Jordan Rhee <jordanrhee@google.com>
Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com>
Signed-off-by: Joshua Washington <joshwash@google.com>
Link: https://patch.msgid.link/20260225182342.1049816-2-joshwash@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
net-next: 07993df560917357610e0625a9a2e7531c3211fc
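A rough sketch of the page budgeting described above follows. The commit message gives the per-QPL upper bounds and the total cap but not how the driver reduces the counts when the cap is exceeded, so the proportional scaling here is purely an assumption for illustration. demo_plan_qpl_pages and its parameters are hypothetical and are not gve_update_num_qpl_pages.

```c
#include <stdint.h>

struct demo_qpl_pages {
	uint32_t tx_pages_per_qpl;
	uint32_t rx_pages_per_qpl;
};

/*
 * Start from the upper bounds named in the commit message
 * (tx_queue_length / 2 and rx_queue_length * 2), then shrink both
 * proportionally if the total would exceed max_registered_pages.
 * The proportional shrink is an assumption of this sketch.
 */
static struct demo_qpl_pages demo_plan_qpl_pages(uint32_t tx_queue_length,
						 uint32_t rx_queue_length,
						 uint32_t num_tx_qpls,
						 uint32_t num_rx_qpls,
						 uint64_t max_registered_pages)
{
	struct demo_qpl_pages p = {
		.tx_pages_per_qpl = tx_queue_length / 2,
		.rx_pages_per_qpl = rx_queue_length * 2,
	};
	uint64_t total = (uint64_t)p.tx_pages_per_qpl * num_tx_qpls +
			 (uint64_t)p.rx_pages_per_qpl * num_rx_qpls;

	if (total > max_registered_pages) {
		/* Integer division rounds down, so the new total stays within the cap. */
		p.tx_pages_per_qpl = (uint32_t)((uint64_t)p.tx_pages_per_qpl *
						max_registered_pages / total);
		p.rx_pages_per_qpl = (uint32_t)((uint64_t)p.rx_pages_per_qpl *
						max_registered_pages / total);
	}
	return p;
}
```

For example, with 1024-entry rings and four QPLs in each direction, the uncapped plan needs 10240 pages; a cap of 5120 halves both per-QPL counts in this sketch.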
The gVNIC device indicates a device option (MODIFY_RING) to the driver,
which presents a range of ring sizes from which the user is allowed to
select. But in DQO-QPL queue format, the driver ignores the "max" of
this range and instead allows the user to configure the ring size in the
range [min, default]. This was done because increasing the ring size
could result in the number of registered pages being higher than the max
allowed by the device.
In order to support large ring sizes, stop ignoring the "max" of the
range presented in the MODIFY_RING option.
Signed-off-by: Matt Olson <maolson@google.com>
Signed-off-by: Max Yuan <maxyuan@google.com>
Reviewed-by: Jordan Rhee <jordanrhee@google.com>
Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com>
Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com>
Signed-off-by: Joshua Washington <joshwash@google.com>
Link: https://patch.msgid.link/20260225182342.1049816-3-joshwash@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
net-next: a2f19184014f309165d2d4cfb41088b75c1121a4
The device behind the DQO format has always coalesced packets per the
stricter hardware GRO spec even though the feature was being advertised
as LRO. Update the advertised capability to match device behavior.
Signed-off-by: Ankit Garg <nktgrg@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com>
Signed-off-by: Joshua Washington <joshwash@google.com>
Link: https://patch.msgid.link/20260303195549.2679070-2-joshwash@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
net-next: e637c244b954426b84340cbc551ca0e2a32058ce
Leaving gso_segs unpopulated on a hardware GRO packet prevents further
coalescing by the software stack: the kernel's GRO logic marks the SKB
for flush because the expected length of all segments doesn't match the
actual payload length. Setting gso_segs correctly results in
significantly more segments being coalesced, as measured by the result
of dev_gro_receive().
gso_segs is derived from the payload length. When header-split is
enabled, the payload is in the non-linear portion of the skb. When
header-split is disabled, we have to parse the headers to determine the
payload length.
Signed-off-by: Ankit Garg <nktgrg@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Jordan Rhee <jordanrhee@google.com>
Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com>
Signed-off-by: Joshua Washington <joshwash@google.com>
Link: https://patch.msgid.link/20260303195549.2679070-3-joshwash@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
net-next: ea4c1176871fd70a06eadcbd7c828f6cb9a1b0cd
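The derivation of a segment count from payload length amounts to a round-up division by the per-segment size. The helper below is a hypothetical userspace sketch, not the driver's function; it assumes the caller already knows the header length (zero-cost with header-split, parsed otherwise) and the per-segment payload size.

```c
#include <stdint.h>

/*
 * Hypothetical helper: number of segments a coalesced packet represents.
 * pkt_len  - total packet length, headers included
 * hdr_len  - combined length of the headers
 * gso_size - payload bytes carried by each original segment
 * Returns 0 when the inputs do not describe a segmentable packet.
 */
static uint32_t demo_gso_segs(uint32_t pkt_len, uint32_t hdr_len,
			      uint32_t gso_size)
{
	uint32_t payload;

	if (gso_size == 0 || pkt_len <= hdr_len)
		return 0;

	payload = pkt_len - hdr_len;
	/* Round up: a short final segment still counts as one segment. */
	return (payload + gso_size - 1) / gso_size;
}
```

With 54 bytes of headers and a segment size of 1448, a 4344-byte payload yields 3 segments, and one extra payload byte yields 4.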
Currently, in DQO mode with hw-gro enabled, the entire received packet
is placed into skb fragments when header-split is disabled. This leaves
the skb linear part empty, forcing the networking stack to do multiple
small memory copies to access the Ethernet, IP and TCP headers.
This patch adds a single memcpy to put all headers into the linear
portion before the packet reaches the SW GRO stack, thus eliminating
multiple smaller memcpy calls.
Additionally, the criteria for calling napi_gro_frags() were updated.
Since skb->head is now populated, we instead check if the SKB is the
cached NAPI scratchpad to ensure we continue using the zero-allocation
path.
Signed-off-by: Ankit Garg <nktgrg@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com>
Signed-off-by: Joshua Washington <joshwash@google.com>
Link: https://patch.msgid.link/20260303195549.2679070-4-joshwash@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
net-next: 0c7025fd24db5b2f8cbd2e1f0050c033b923fd48
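The "single memcpy" idea can be sketched in userspace as: compute the combined header length once, then copy that many bytes into a separate linear buffer. The functions below are hypothetical, handle only untagged Ethernet + IPv4 + TCP frames, and operate on plain byte arrays in place of an skb and its fragments; the driver's actual parsing may differ.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_ETH_HLEN 14

/*
 * Return the combined Ethernet + IPv4 + TCP header length of 'frame',
 * or 0 if the frame is too short or not untagged IPv4/TCP.
 */
static size_t demo_tcp4_header_len(const uint8_t *frame, size_t len)
{
	size_t ihl, doff;

	if (len < DEMO_ETH_HLEN + 20)
		return 0;
	if (frame[12] != 0x08 || frame[13] != 0x00) /* EtherType IPv4 */
		return 0;

	ihl = (size_t)(frame[DEMO_ETH_HLEN] & 0x0f) * 4; /* IPv4 header bytes */
	if (ihl < 20 || len < DEMO_ETH_HLEN + ihl + 20)
		return 0;
	if (frame[DEMO_ETH_HLEN + 9] != 6) /* IP protocol TCP */
		return 0;

	doff = (size_t)(frame[DEMO_ETH_HLEN + ihl + 12] >> 4) * 4; /* TCP header bytes */
	if (doff < 20 || len < DEMO_ETH_HLEN + ihl + doff)
		return 0;

	return DEMO_ETH_HLEN + ihl + doff;
}

/* Copy all headers from the fragment into the linear buffer in one memcpy. */
static size_t demo_pull_headers(uint8_t *linear, size_t linear_cap,
				const uint8_t *frag, size_t frag_len)
{
	size_t hlen = demo_tcp4_header_len(frag, frag_len);

	if (hlen == 0 || hlen > linear_cap)
		return 0;
	memcpy(linear, frag, hlen);
	return hlen;
}
```

After the pull, later header accesses read from the linear buffer directly instead of copying each header out of the fragment separately.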
Change the driver's default behavior to enable hw-gro whenever the
device supports it.
Performance observations:
- We observed ~10% improvement in RX single stream throughput across
  various MTU sizes.
- No change in TCP_RR/TCP_CRR latencies.
Signed-off-by: Ankit Garg <nktgrg@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com>
Signed-off-by: Joshua Washington <joshwash@google.com>
Link: https://patch.msgid.link/20260303195549.2679070-5-joshwash@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
net-next: 3c398063ef01b02d7efd31662154fe70fd28ace6
Enable support for UDP GSO when using DQO format. Advertise the feature
flag during device initialization and enable offload by default.
Signed-off-by: Ankit Garg <nktgrg@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com>
Link: https://patch.msgid.link/20260306224816.3391551-1-hramamurthy@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
net-next: 014c607f86abc903d7bf46e13373d89392e371fe
ipv6_hopopt_jumbo_remove() is no longer needed for kernels >= v7.0.0.
Update the macro accordingly.
This is a squash of tg/2816017, tg/2816959, and tg/2828062 from
upstream-staging.
The following kernel changes were backported to RHEL10.2, which require
corresponding GVE changes when compiling against this kernel.
- "net: hold netdev instance lock during queue operations"
The following changes were backported to RHEL10.2, which require
corresponding GVE changes when compiling against this kernel.
- xdp: double protect netdev->xdp_flags with netdev->lock
- xdp: create locked/unlocked instances of xdp redirect target setters
Introduce an upstream_test command to build.sh that packs the upstream
driver (without coccinelle pre-processing) and Makefile.oot to build a
kernel module for upstream-staging testing.
Also, the current deploy_to_cns kokoro workflows are broken because the
`build/` folder is not available when we call `build_src.sh` without
`-r`. Update the folder name in kokoro/build.sh to fix this.
Kernel v7.0.0 added a series of new memory allocation APIs such as
kmalloc_objs. Add a cocci patch, generated by Gemini, to handle them.
Coccinelle was set up to run in parallel for C files and header files,
but not per individual file. This could cause a bottleneck in
processing, causing Coccinelle to sometimes take upwards of a minute to
execute.
Cut down the run time by running the suite of coccinelle patches on each
source file in parallel. This change results in a roughly 3x improvement
in execution time on my 12-core, 24-thread workstation.
Execution time before:
real 1m1.079s
user 0m54.183s
sys 0m9.117s
Execution time after:
real 0m18.629s
user 1m12.268s
sys 0m23.229s
Signed-off-by: Joshua Washington <joshwash@google.com>
(cherry picked from commit 10fb17d20a4fd5f906508ac370fbef2cae72e737)
hramamurthy12 approved these changes on May 8, 2026
added 2 commits
May 8, 2026 10:49
Add release updates for previous releases.