linux/include/trace/events
David Howells 1d0013962d
netfs: Fix a number of read-retry hangs
Fix a number of hangs in the netfslib read-retry code, including:

 (1) netfs_reissue_read() doubles up the getting of references on
     subrequests, thereby leaking the subrequest and causing inode eviction
     to wait indefinitely.  This can lead to the kernel reporting a hang in
     the filesystem's evict_inode().

     Fix this by removing the get from netfs_reissue_read() and adding one
     to netfs_retry_read_subrequests() to deal with the one place that
     didn't double up.

 (2) The loop in netfs_retry_read_subrequests() that retries a sequence of
     failed subrequests doesn't record whether or not it retried the one
     that the "subreq" pointer points to when it leaves the loop.  It may
     not if renegotiation/repreparation of the subrequests means that fewer
     subrequests are needed to span the cumulative range of the sequence.

     Because it doesn't record this, the piece of code that discards
     now-superfluous subrequests doesn't know whether it should discard the
     one "subreq" points to - and so it doesn't.

     Fix this by noting whether the last subreq it examines is superfluous
     and if it is, then getting rid of it and all subsequent subrequests.

     If that one wasn't superfluous, then we would have tried to go
     round the previous loop again and so there can be no further unretried
     subrequests in the sequence.

 (3) netfs_retry_read_subrequests() gets yet another extra ref on any
     additional subrequests it has to allocate because it ran out of ones
     it could reuse due to renegotiation/repreparation shrinking the
     subrequests.

     Fix this by removing that extra ref.

 (4) netfs_retry_reads() was using wait_on_bit() to wait for
     NETFS_SREQ_IN_PROGRESS to be cleared on all subrequests in the
     sequence - but netfs_read_subreq_terminated() is now using a wait
     queue on the request instead and so this wait will never finish.

     Fix this by waiting on the wait queue instead.  To make this work, a
     new flag, NETFS_RREQ_RETRYING, is now set around the wait loop to tell
     the wake-up code to wake up the wait queue rather than requeuing the
     request's work item.

     Note that this flag replaces the NETFS_RREQ_NEED_RETRY flag which is
     no longer used.

 (5) Whilst not strictly anything to do with the hang,
     netfs_retry_read_subrequests() was also doubly incrementing the
     subreq_counter and re-setting the debug index, leaving a gap in the
     trace.  This is also fixed.
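
The reference imbalance described in (1) and (3) can be illustrated with a
small userspace model.  This is only a sketch: the toy_* names are
hypothetical and do not correspond to netfslib functions, and the model
assumes one owner ref plus one ref per in-flight retry.  It shows that if
both the retry loop and the reissue helper take a ref, but termination
drops only one, each retry leaves one ref behind and the count never
returns to the owner's single ref.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a refcounted subrequest; not the kernel struct. */
struct toy_subreq {
	int ref;	/* outstanding references */
};

/* Take one reference. */
static void toy_get(struct toy_subreq *s)
{
	s->ref++;
}

/* Drop one reference; returns true when the last reference has gone. */
static bool toy_put(struct toy_subreq *s)
{
	assert(s->ref > 0);
	return --s->ref == 0;
}

/*
 * Model one retry.  The retry loop takes a ref for the I/O.  If the reissue
 * helper takes one as well (the doubled-up get), only one of the two is
 * dropped when the I/O terminates, so one ref is leaked per retry.
 */
static void toy_retry(struct toy_subreq *s, bool reissue_also_gets)
{
	toy_get(s);			/* ref taken by the retry loop */
	if (reissue_also_gets)
		toy_get(s);		/* duplicate ref from the reissue helper */
	(void)toy_put(s);		/* termination drops exactly one ref */
}

/* Ref count left after 'retries' retries, starting from one owner ref. */
static int toy_refs_after(int retries, bool reissue_also_gets)
{
	struct toy_subreq s = { .ref = 1 };

	for (int i = 0; i < retries; i++)
		toy_retry(&s, reissue_also_gets);
	return s.ref;
}
```

With the duplicate get, toy_refs_after(3, true) leaves four refs where one
is expected, so a final put by the owner cannot free the object.  Without
it, the count stays balanced.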

One of these hangs was observed with 9p and with cifs.  Others were forced
by manual code injection into fs/afs/file.c.  Firstly, afs_prepare_read()
was created to provide a changing pattern of maximum subrequest sizes:

	static int afs_prepare_read(struct netfs_io_subrequest *subreq)
	{
		struct netfs_io_request *rreq = subreq->rreq;
		if (!S_ISREG(subreq->rreq->inode->i_mode))
			return 0;
		if (subreq->retry_count < 20)
			rreq->io_streams[0].sreq_max_len =
				umax(200, 2222 - subreq->retry_count * 40);
		else
			rreq->io_streams[0].sreq_max_len = 3333;
		return 0;
	}

and pointed to by afs_req_ops.  Then the following:

	struct netfs_io_subrequest *subreq = op->fetch.subreq;
	if (subreq->error == 0 &&
	    S_ISREG(subreq->rreq->inode->i_mode) &&
	    subreq->retry_count < 20) {
		subreq->transferred = subreq->already_done;
		__clear_bit(NETFS_SREQ_HIT_EOF, &subreq->flags);
		__set_bit(NETFS_SREQ_NEED_RETRY, &subreq->flags);
		afs_fetch_data_notify(op);
		return;
	}

was inserted into afs_fetch_data_success() at the beginning and struct
netfs_io_subrequest was given an extra field, "already_done", that was set
to the value in "subreq->transferred" by netfs_reissue_read().

When reading a 4K file, the subrequests would get gradually smaller, a new
subrequest would be allocated around the 3rd retry and then eventually be
rendered superfluous when the 20th retry was hit and the limit on the first
subrequest was eased.
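
The size pattern can be worked through with a standalone sketch of the
arithmetic in the injected afs_prepare_read().  The toy_* names and the
4096-byte file size are assumptions for illustration.  The model also
assumes that every subrequest is capped at the same maximum length and that
nothing has been transferred yet, which netfslib does not guarantee, so the
exact retry at which the extra subrequest appears will differ from the real
run.

```c
#include <assert.h>

#define TOY_FILE_SIZE 4096u	/* the "4K file" from the text */

/* Same arithmetic as the injected afs_prepare_read(), with umax() spelled out. */
static unsigned int toy_sreq_max_len(unsigned int retry_count)
{
	if (retry_count < 20) {
		unsigned int len = 2222 - retry_count * 40;

		return len > 200 ? len : 200;
	}
	return 3333;
}

/* Subrequests needed to span the file if each is capped at the same length. */
static unsigned int toy_subreqs_needed(unsigned int retry_count)
{
	unsigned int max_len = toy_sreq_max_len(retry_count);

	assert(max_len > 0);
	return (TOY_FILE_SIZE + max_len - 1) / max_len;	/* round up */
}
```

Under these assumptions the limit shrinks from 2222 to 1462 bytes over
retries 0 to 19, so the span grows from two subrequests to three, and the
jump to 3333 bytes at retry 20 brings it back to two, leaving one
subrequest superfluous.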

Fixes: e2d46f2ec3 ("netfs: Change the read result collector to only use one work item")
Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/20250212222402.3618494-2-dhowells@redhat.com
Tested-by: Marc Dionne <marc.dionne@auristor.com>
Tested-by: Steve French <stfrench@microsoft.com>
cc: Ihor Solodrai <ihor.solodrai@pm.me>
cc: Eric Van Hensbergen <ericvh@kernel.org>
cc: Latchesar Ionkov <lucho@ionkov.net>
cc: Dominique Martinet <asmadeus@codewreck.org>
cc: Christian Schoenebeck <linux_oss@crudebyte.com>
cc: Paulo Alcantara <pc@manguebit.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: v9fs@lists.linux.dev
cc: linux-cifs@vger.kernel.org
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-02-13 16:00:38 +01:00
9p.h 9p: prevent read overrun in protocol dump tracepoint 2023-12-05 21:18:44 +09:00
afs.h vfs-6.14-rc1.afs 2025-01-20 11:40:48 -08:00
alarmtimer.h
amdxdna.h accel/amdxdna: Add command execution 2024-11-22 11:43:27 -07:00
asoc.h ALSA: trace: use snd_pcm_direction_name() 2024-08-01 12:50:03 +02:00
avc.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
bcache.h
block.h block: remove the ioprio field from struct request 2024-11-12 14:42:02 -07:00
bpf_test_run.h bpf: add bpf_modify_return_test_tp() kfunc triggering tracepoint 2024-03-28 18:31:40 -07:00
bridge.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
btrfs.h btrfs: zoned: reclaim unused zone by zone resetting 2025-01-13 14:53:14 +01:00
cachefiles.h cachefiles: Add auxiliary data trace 2024-12-20 22:34:05 +01:00
capability.h security: add trace event for cap_capable 2024-12-04 20:59:21 -06:00
cgroup.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
clk.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
cma.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
compaction.h mm: compaction: update the cc->nr_migratepages when allocating or freeing the freepages 2024-02-22 10:24:50 -08:00
context_tracking.h
cpuhp.h
csd.h smp: Change function signatures to use call_single_data_t 2023-09-13 14:59:24 +02:00
damon.h mm/damon: fix order of arguments in damos_before_apply tracepoint 2024-12-05 19:54:47 -08:00
devfreq.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
devlink.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
dlm.h dlm: remove lkb from callback tracepoints 2024-04-01 13:31:12 -05:00
dma.h dma-mapping: trace more error paths 2024-10-29 08:54:06 +01:00
dma_fence.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
erofs.h erofs: get rid of z_erofs_map_blocks_iter_* tracepoints 2024-07-10 18:57:06 +08:00
error_report.h
ext4.h ext4: remove tracing for FALLOC_FL_NO_HIDE_STALE 2024-08-28 16:53:57 +02:00
f2fs.h f2fs: Remove calls to folio_file_mapping() 2024-12-16 16:12:26 +00:00
fib.h net: Replace strlcpy with strscpy 2023-07-04 19:40:16 +01:00
fib6.h tracing: ipv6: Add flow label to fib6_table_lookup tracepoint 2024-12-19 16:02:22 +01:00
filelock.h filelock: split leases out of struct file_lock 2024-02-05 13:11:44 +01:00
filemap.h filemap: add trace events for get_pages, map_pages, and fault 2024-09-01 20:26:10 -07:00
firewire.h firewire: core: rename cause flag of tracepoints event 2024-09-12 22:30:38 +09:00
firewire_ohci.h firewire: ohci: add tracepoints event for data of Self-ID DMA 2024-07-04 09:07:14 +09:00
fs_dax.h dax: use huge_zero_folio 2024-04-25 20:56:20 -07:00
fscache.h cachefiles: fix slab-use-after-free in fscache_withdraw_volume() 2024-07-03 10:36:14 +02:00
fsi.h fsi: core: Add trace events for scan and unregister 2023-08-09 15:43:28 +09:30
fsi_master_aspeed.h
fsi_master_ast_cf.h
fsi_master_gpio.h
fsi_master_i2cr.h fsi: Add IBM I2C Responder virtual FSI master 2023-08-11 13:32:14 +09:30
gpio.h
gpu_mem.h
habanalabs.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
handshake.h net/handshake: Trace events for TLS Alert helpers 2023-07-28 14:07:59 -07:00
host1x.h
huge_memory.h mm: khugepaged: fix the arguments order in khugepaged_collapse_file trace point 2024-10-17 00:28:09 -07:00
hugetlbfs.h hugetlb: fix NULL pointer dereference in trace_hugetlbfs_alloc_inode 2025-01-12 19:03:36 -08:00
hw_pressure.h sched/cpufreq: Rename arch_update_thermal_pressure() => arch_update_hw_pressure() 2024-04-24 12:08:01 +02:00
hwmon.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
i2c.h
i2c_slave.h
ib_mad.h IB/mad: Don't call to function that might sleep while in atomic context 2022-11-10 10:57:15 +02:00
ib_umad.h
icmp.h net/ipv4: add tracepoint for icmp_send 2024-05-08 10:39:26 +01:00
initcall.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
intel-sst.h
intel_ifs.h trace: platform/x86/intel/ifs: Add SBAF trace support 2024-08-12 16:36:11 +02:00
intel_ish.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
io_uring.h io_uring: clean up cqe trace points 2024-10-29 13:43:27 -06:00
iocost.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
iommu.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
ipi.h trace: Add trace_ipi_send_cpu() 2023-03-24 11:01:29 +01:00
irq.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
irq_matrix.h
iscsi.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
jbd2.h jbd2: remove journal_clean_one_cp_list() 2023-07-10 23:09:21 -04:00
kmem.h mm: remove CONFIG_MEMCG_KMEM 2024-07-10 12:14:54 -07:00
ksm.h mm/ksm: add tracepoint for ksm advisor 2023-12-29 11:58:27 -08:00
kvm.h LoongArch: KVM: Add iocsr and mmio bus simulation in kernel 2024-11-13 16:18:26 +08:00
kyber.h kyber: Replace strlcpy with strscpy 2023-07-17 08:18:17 -06:00
libata.h ata: libata: add qc->flags in ata_qc_complete_template tracepoint 2022-06-17 16:30:03 +09:00
lock.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
maple_tree.h Maple Tree: add new data structure 2022-09-26 19:46:13 -07:00
mce.h x86/MCE/AMD: Add support for new MCA_SYND{1,2} registers 2024-10-31 10:36:07 +01:00
mctp.h
mdio.h trace: events: cleanup deprecated strncpy uses 2024-04-05 22:10:25 -07:00
memcg.h memcg: add flush tracepoint 2024-11-11 00:26:46 -08:00
migrate.h mm/migrate: add MR_DAMON to migrate_reason 2024-07-03 19:30:12 -07:00
mlxsw.h
mmap.h mm: mmap: remove newline at the end of the trace 2023-03-23 17:18:36 -07:00
mmap_lock.h mm: mmap_lock: optimize mmap_lock tracepoints 2025-01-13 22:40:34 -08:00
mmc.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
mmflags.h The various patchsets are summarized below. Plus of course many 2025-01-26 18:36:23 -08:00
module.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
mptcp.h mptcp: sched: check both directions for backup 2024-07-30 10:27:29 +02:00
napi.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
nbd.h nbd: Use NULL to represent a pointer 2024-05-14 07:22:35 -06:00
neigh.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
net.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
net_probe_common.h trace: adjust TP_STORE_ADDR_PORTS_SKB() parameters 2024-04-03 19:26:14 -07:00
netfs.h netfs: Fix a number of read-retry hangs 2025-02-13 16:00:38 +01:00
netlink.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
nilfs2.h nilfs2: use __field_struct() for a bitwise field 2024-05-11 15:51:43 -07:00
nmi.h
notifier.h notifiers: add tracepoints to the notifiers infrastructure 2023-04-08 13:45:38 -07:00
objagg.h
oom.h mm: improve code consistency with zonelist_* helper functions 2024-09-01 20:25:55 -07:00
osnoise.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
page_isolation.h
page_pool.h page_pool: devmem support 2024-09-11 20:44:31 -07:00
page_ref.h trace/events/page_ref: trace the raw page mapcount value 2024-05-05 17:53:31 -07:00
pagemap.h
percpu.h include/trace/events/percpu.h: cleanup for "percpu: improve percpu_alloc_percpu event trace" 2022-05-25 10:47:48 -07:00
power.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
power_cpu_migrate.h
preemptirq.h tracing: Remove definition of trace_*_rcuidle() 2024-10-08 21:17:39 -04:00
printk.h
pwc.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
pwm.h pwm: Add tracing for waveform callbacks 2024-09-28 15:13:56 +02:00
qdisc.h tracing/net_sched: NULL pointer dereference in perf_trace_qdisc_reset() 2024-06-27 11:06:30 +02:00
qla.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
qrtr.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
rcu.h context_tracking, rcu: Rename rcu_dyntick trace event into rcu_watching 2024-08-15 21:30:43 +05:30
rdma_core.h
regulator.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
rpcgss.h SUNRPC: Fixup gss_status tracepoint error output 2024-07-18 10:49:15 -04:00
rpcrdma.h svcrdma: Handle device removal outside of the CM event handler 2024-09-20 19:31:03 -04:00
rpm.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
rseq.h tracing/rseq: Add mm_cid field to rseq_update 2022-12-27 12:52:15 +01:00
rtc.h
rust_sample.h rust: samples: add tracepoint to Rust sample 2024-11-04 16:21:44 -05:00
rwmmio.h asm-generic/io: Add _RET_IP_ to MMIO trace for more accurate debug info 2022-11-21 22:02:10 +01:00
rxrpc.h rxrpc: Fix the rxrpc_connection attend queue handling 2025-02-04 15:30:28 +01:00
sched.h tracing/sched: sched_switch: place prev_comm and next_comm in right order 2024-07-15 15:01:01 -04:00
sched_ext.h sched_ext: Print debug dump after an error exit 2024-06-18 10:09:18 -10:00
scmi.h include: trace: Widen the tag buffer in trace_scmi_dump_msg 2024-03-26 11:17:40 +00:00
scsi.h scsi: sd: Atomic write support 2024-06-20 15:19:17 -06:00
sctp.h
signal.h
siox.h
skb.h net: add rx_sk to trace_kfree_skb 2024-06-19 12:44:22 +01:00
smbus.h
sock.h trace: events: cleanup deprecated strncpy uses 2024-04-05 22:10:25 -07:00
sof.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
sof_intel.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
spi.h spi: Fix spelling typos and acronyms capitalization 2023-07-11 14:14:32 +01:00
spmi.h spmi: trace: fix stack-out-of-bound access in SPMI tracing functions 2022-07-24 16:16:44 +02:00
sunrpc.h sunrpc: remove newlines from tracepoints 2024-11-08 14:26:21 -05:00
sunvnet.h
swiotlb.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
syscalls.h tracing: Declare system call tracepoints with TRACE_EVENT_SYSCALL 2024-10-09 17:05:54 -04:00
target.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
task.h tracing: Remove pid in task_rename tracing output 2024-12-22 20:28:11 -08:00
tcp.h tcp: Use skb__nullable in trace_tcp_send_reset 2024-09-11 08:56:42 -07:00
tegra_apb_dma.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
thp.h powerpc/book3s64/mm: enable transparent pud hugepage 2023-08-18 10:12:55 -07:00
timer.h tracing/timers: Add tracepoint for tracking timer base is_idle flag 2023-12-20 16:49:38 +01:00
timer_migration.h timers/migration: Rename childmask by groupmask to make naming more obvious 2024-07-22 18:03:34 +02:00
timestamp.h fs: tracepoints around multigrain timestamp events 2024-10-10 10:20:52 +02:00
tlb.h
udp.h trace: adjust TP_STORE_ADDR_PORTS_SKB() parameters 2024-04-03 19:26:14 -07:00
v4l2.h
vb2.h
vmalloc.h mm: vmalloc: add free_vmap_area_noflush trace event 2022-11-08 17:37:17 -08:00
vmscan.h vmscan: add a vmscan event for reclaim_pages 2024-11-06 20:11:13 -08:00
vsock_virtio_transport_common.h vsock/virtio: MSG_ZEROCOPY flag support 2023-09-21 12:34:00 +02:00
watchdog.h watchdog: Add tracing events for the most usual watchdog events 2022-10-12 09:47:02 +02:00
wbt.h blk-wbt: Replace strlcpy with strscpy 2023-07-17 08:18:17 -06:00
workqueue.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
writeback.h writeback: Refine the show_inode_state() macro definition 2024-08-30 08:22:41 +02:00
xdp.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
xen.h x86/xen: move paravirt lazy code 2023-09-19 07:04:49 +02:00