dump_stack()

Published: 2023-01-22


Function call relationships

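To understand the function call relationships in a project, you can start at the program entry point and trace the calls layer by layer. That works for small projects, but for something as large and complex as the Linux kernel it becomes very laborious. The most obvious difficulty is the sheer number of conditional branches: the call relationships are so tangled that following them by hand is exhausting.
So is there a tool that can work out the call chain for us automatically, so that we do not have to dig through it layer by layer ourselves? There is: dump_stack().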

dump_stack()

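When you want to know how func_a() is reached through its chain of callers, add a dump_stack() call inside func_a().
For example, to analyze the TX flow of the kernel network stack, I can add dump_stack() to __netdev_start_xmit(), the function where the network stack hands packets over to the driver on the TX path.
Note that dump_stack() can only be used in kernel space.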
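For comparison, user-space programs can get a similar view of the call chain. The following is a minimal sketch, assuming a glibc system where backtrace() and backtrace_symbols_fd() from <execinfo.h> are available; user_dump_stack() and func_a() are hypothetical names used only for illustration, and this is not how the kernel implements dump_stack().

```c
#include <execinfo.h>   /* backtrace(), backtrace_symbols_fd(): glibc-specific */
#include <unistd.h>     /* STDERR_FILENO */

#define MAX_FRAMES 64

/* Print the current call chain to stderr and return the number of frames.
 * Hypothetical helper: a rough user-space analog of dump_stack(). */
static int user_dump_stack(void)
{
	void *frames[MAX_FRAMES];
	int n = backtrace(frames, MAX_FRAMES);

	/* Writes one line per frame directly to the file descriptor. */
	backtrace_symbols_fd(frames, n, STDERR_FILENO);
	return n;
}

/* Hypothetical function whose callers we want to see. */
int func_a(void)
{
	return user_dump_stack();
}
```

Calling func_a() from anywhere in a program prints the chain that led to it. Function names generally appear only if the program is linked with -rdynamic; otherwise most frames show up as raw addresses.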

Example

I actually added dump_stack() to netdev_start_xmit(). The effect is the same, because netdev_start_xmit() calls __netdev_start_xmit() immediately afterwards.

static inline netdev_tx_t netdev_start_xmit(struct sk_buff *skb, struct net_device *dev,
					    struct netdev_queue *txq, bool more)
{
	const struct net_device_ops *ops = dev->netdev_ops;
	int rc;

	printk("[xxx-dump] in %s, line = %d, dump start\n", __func__, __LINE__);
	dump_stack();
	printk("[xxx-dump] in %s, line = %d, dump end\n", __func__, __LINE__);

	rc = __netdev_start_xmit(ops, skb, dev, more);
	if (rc == NETDEV_TX_OK)
		txq_trans_update(txq);

	return rc;
}

Rebuild the kernel and run it:

[xxx-dump] in netdev_start_xmit, line = 3623, dump start
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.1.15 #9
Hardware name: Freescale i.MX6 Ultralite (Device Tree)
[<80015ed4>] (unwind_backtrace) from [<80012794>] (show_stack+0x10/0x14)
[<80012794>] (show_stack) from [<8068ca34>] (dump_stack+0x80/0xc8)
[<8068ca34>] (dump_stack) from [<80549f74>] (dev_hard_start_xmit+0x228/0x35c)
[<80549f74>] (dev_hard_start_xmit) from [<80562fe4>] (sch_direct_xmit+0xc4/0x1f4)
[<80562fe4>] (sch_direct_xmit) from [<8054a2d8>] (__dev_queue_xmit+0x230/0x54c)
[<8054a2d8>] (__dev_queue_xmit) from [<805c92ac>] (ip6_finish_output2+0x164/0x608)
[<805c92ac>] (ip6_finish_output2) from [<805cd254>] (ip6_output+0xb4/0x198)
[<805cd254>] (ip6_output) from [<805e0e38>] (ndisc_send_skb+0x324/0x3a0)
[<805e0e38>] (ndisc_send_skb) from [<805e1a74>] (ndisc_send_ns+0xe0/0x158)
[<805e1a74>] (ndisc_send_ns) from [<805e1be8>] (ndisc_solicit+0xfc/0x124)
[<805e1be8>] (ndisc_solicit) from [<80551bb0>] (neigh_probe+0x4c/0x7c)
[<80551bb0>] (neigh_probe) from [<80553be8>] (neigh_timer_handler+0x204/0x28c)
[<80553be8>] (neigh_timer_handler) from [<8007d61c>] (call_timer_fn+0x24/0x9c)
[<8007d61c>] (call_timer_fn) from [<8007dbc8>] (run_timer_softirq+0x1b8/0x250)
[<8007dbc8>] (run_timer_softirq) from [<8003b3e4>] (__do_softirq+0xf0/0x228)
[<8003b3e4>] (__do_softirq) from [<8003b7ac>] (irq_exit+0xb0/0xfc)
[<8003b7ac>] (irq_exit) from [<8006f61c>] (__handle_domain_irq+0x70/0xe8)
[<8006f61c>] (__handle_domain_irq) from [<80009440>] (gic_handle_irq+0x20/0x60)
[<80009440>] (gic_handle_irq) from [<80013240>] (__irq_svc+0x40/0x74)
Exception stack(0x80975f30 to 0x80975f78)
5f20:                                     80975f78 00000009 0a73caeb 0000000b
5f40: 0a4c8d58 0000000b 00000002 97b91dd8 0a73caeb 0000000b 80971740 00000001
5f60: 00000017 80975f78 a6aaaaab 804a6e6c 20010013 ffffffff
[<80013240>] (__irq_svc) from [<804a6e6c>] (cpuidle_enter_state+0xcc/0x1f8)
[<804a6e6c>] (cpuidle_enter_state) from [<800671ac>] (cpu_startup_entry+0x1c0/0x324)
[<800671ac>] (cpu_startup_entry) from [<8090fbec>] (start_kernel+0x338/0x3a4)
[xxx-dump] in netdev_start_xmit, line = 3625, dump end

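The call chain appears at once, from start_kernel() all the way down to the dump_stack() call site. netdev_start_xmit() itself does not show up as a frame: it is static inline, so the call to dump_stack() is attributed to dev_hard_start_xmit(). Reading the code with this chain in hand is smooth. Analyzing the code directly makes it easy to go wrong, whereas with the call chain as a guide the analysis is quick and correct.
The early part of this chain (the bottom of the trace) involves an interrupt rather than ordinary function calls, so it still takes some effort to analyze. But look at it the other way around: without dump_stack(), would we ever have guessed that cpuidle_enter_state() and __irq_svc() are connected? Trying to work that out just by staring at the code could drive you mad.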
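Each frame line in the trace above has the shape `[<pc>] (callee) from [<return address>] (caller+offset/size)`: the callee was entered from the caller, the return address lies `offset` bytes into the caller, and the caller is `size` bytes long. To make that layout concrete, here is a small user-space sketch that splits one such line into its fields. The names bt_frame and parse_bt_line() are hypothetical, and the format string assumes the 32-bit ARM layout printed by this 4.1.15 kernel; other architectures and kernel versions print traces differently.

```c
#include <stdio.h>

/* One parsed frame line (hypothetical structure for illustration). */
struct bt_frame {
	unsigned long callee_pc;    /* address inside the callee */
	char callee[64];            /* name of the called function */
	unsigned long return_addr;  /* address inside the caller */
	char caller[64];            /* name of the calling function */
	unsigned long offset;       /* return address offset within the caller */
	unsigned long size;         /* total size of the caller in bytes */
};

/* Parse a line such as
 *   [<8068ca34>] (dump_stack) from [<80549f74>] (dev_hard_start_xmit+0x228/0x35c)
 * Returns 0 on success, -1 if the line does not match this layout. */
int parse_bt_line(const char *line, struct bt_frame *f)
{
	int n = sscanf(line, "[<%lx>] (%63[^)]) from [<%lx>] (%63[^+]+%lx/%lx)",
		       &f->callee_pc, f->callee,
		       &f->return_addr, f->caller,
		       &f->offset, &f->size);

	return n == 6 ? 0 : -1;
}
```

Reading the trace from the bottom up with this layout in mind gives the call order: the caller on the last line is the outermost function, and the callee on the first line is the innermost.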

Tips

While analyzing the later part of the call chain in the example above, I found that ip6_output() does not call ip6_finish_output2() anywhere in the source. What is going on?

int ip6_output(struct net *net, struct sock *sk, struct sk_buff *skb)
{
	struct net_device *dev = skb_dst(skb)->dev;
	struct inet6_dev *idev = ip6_dst_idev(skb_dst(skb));

	if (unlikely(idev->cnf.disable_ipv6)) {
		IP6_INC_STATS(net, idev, IPSTATS_MIB_OUTDISCARDS);
		kfree_skb(skb);
		return 0;
	}

	return NF_HOOK_COND(NFPROTO_IPV6, NF_INET_POST_ROUTING,
			    net, sk, skb, NULL, dev,
			    ip6_finish_output,
			    !(IP6CB(skb)->flags & IP6SKB_REROUTED));
}

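The reason is that the compiler optimizes the code at build time to improve speed and code density. Here it has inlined ip6_finish_output() into ip6_output(), so ip6_finish_output() has no stack frame of its own. That is unfriendly for debugging and for learning the kernel, so when debugging it is advisable to turn these optimizations down: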

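  1. Lower the optimization level from O2 to O1 in the top-level Makefile:
# KBUILD_CFLAGS	+= -O2
KBUILD_CFLAGS	+= -O1
  2. Stop the compiler from automatically inlining functions that are called from only one place. Enable CONFIG_DEBUG_SECTION_MISMATCH in .config:
# CONFIG_DEBUG_SECTION_MISMATCH=n
CONFIG_DEBUG_SECTION_MISMATCH=y

     With that option set, the Makefile adds -fno-inline-functions-called-once:
# We trigger additional mismatches with less inlining
ifdef CONFIG_DEBUG_SECTION_MISMATCH
KBUILD_CFLAGS += $(call cc-option, -fno-inline-functions-called-once)
endif
  3. Manually remove the inline qualifier from selected functions.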
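The effect of inlining on a stack trace can be reproduced in user space. The sketch below is an illustration under assumptions, not kernel code: it relies on glibc's backtrace() and on the GCC/Clang attributes noinline and always_inline, and the function names are hypothetical. A function that is inlined contributes no frame, while a function that is kept out of line contributes exactly one.

```c
#include <execinfo.h>   /* backtrace(): glibc-specific */

#define MAX_FRAMES 64

/* Count the frames on the current call chain, including this function. */
static __attribute__((noinline)) int count_frames(void)
{
	void *frames[MAX_FRAMES];

	return backtrace(frames, MAX_FRAMES);
}

/* Forced inline: the body is merged into the caller, so no frame is added. */
static inline __attribute__((always_inline)) int via_inline(void)
{
	return count_frames();
}

/* Never inlined: keeps a frame of its own. The volatile store stops the
 * compiler from emitting a tail call, which would also drop the frame. */
__attribute__((noinline)) int via_noinline(void)
{
	volatile int n = count_frames();

	return n;
}
```

Called from the same place, via_noinline() reports one frame more than via_inline(), which is the same difference as the missing ip6_finish_output() frame in the first trace. In kernel code the corresponding annotation is the noinline macro, which can be an alternative to deleting inline by hand.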

Rebuild and run again:

[xxx-dump] in netdev_start_xmit, line = 3623, dump start
CPU: 0 PID: 541 Comm: connmand Not tainted 4.1.15 #10
Hardware name: Freescale i.MX6 Ultralite (Device Tree)
[<80016850>] (unwind_backtrace) from [<800130fc>] (show_stack+0x10/0x14)
[<800130fc>] (show_stack) from [<802bb7a0>] (__dump_stack+0x18/0x20)
[<802bb7a0>] (__dump_stack) from [<802bb810>] (dump_stack+0x68/0xb8)
[<802bb810>] (dump_stack) from [<80588830>] (netdev_start_xmit+0x30/0x9c)
[<80588830>] (netdev_start_xmit) from [<8058b67c>] (xmit_one+0x64/0x74)
[<8058b67c>] (xmit_one) from [<8058f2cc>] (dev_hard_start_xmit+0x40/0x90)
[<8058f2cc>] (dev_hard_start_xmit) from [<805aa0bc>] (sch_direct_xmit+0x94/0x208)
[<805aa0bc>] (sch_direct_xmit) from [<8058f540>] (__dev_queue_xmit+0x224/0x4bc)
[<8058f540>] (__dev_queue_xmit) from [<8058f7f8>] (dev_queue_xmit_sk+0x20/0x2c)
[<8058f7f8>] (dev_queue_xmit_sk) from [<806152c8>] (dev_queue_xmit+0x10/0x14)
[<806152c8>] (dev_queue_xmit) from [<80615374>] (neigh_hh_output+0xa8/0xac)
[<80615374>] (neigh_hh_output) from [<806153cc>] (dst_neigh_output+0x54/0x70)
[<806153cc>] (dst_neigh_output) from [<806157f8>] (ip6_finish_output2+0x410/0x540)
[<806157f8>] (ip6_finish_output2) from [<8061962c>] (ip6_finish_output+0x140/0x150)
[<8061962c>] (ip6_finish_output) from [<806197b4>] (ip6_output+0x178/0x194)
[<806197b4>] (ip6_output) from [<8064d2ec>] (ip6_local_out_sk+0x38/0x3c)
[<8064d2ec>] (ip6_local_out_sk) from [<8064d300>] (ip6_local_out+0x10/0x14)
[<8064d300>] (ip6_local_out) from [<80619d18>] (ip6_send_skb+0xc/0x108)
[<80619d18>] (ip6_send_skb) from [<806328ac>] (udp_v6_send_skb+0x280/0x298)
[<806328ac>] (udp_v6_send_skb) from [<80632fa4>] (udpv6_sendmsg+0x618/0xaa0)
[<80632fa4>] (udpv6_sendmsg) from [<805ef7b4>] (inet_sendmsg+0xac/0xc0)
[<805ef7b4>] (inet_sendmsg) from [<805750e8>] (sock_sendmsg+0x14/0x24)
[<805750e8>] (sock_sendmsg) from [<80576c20>] (SyS_sendto+0xb8/0xdc)
[<80576c20>] (SyS_sendto) from [<8000f5a0>] (ret_fast_syscall+0x0/0x3c)
[xxx-dump] in netdev_start_xmit, line = 3625, dump end

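You can see that ip6_output() first calls ip6_finish_output(), which in turn calls ip6_finish_output2(). This now matches the source exactly.
Notice that this trace does not start at start_kernel(). That is most likely not because dump_stack() ran out of room: the header shows that this packet was sent by the user process connmand (PID 541), so the chain begins at the system call entry (SyS_sendto, reached from ret_fast_syscall). The first trace was captured in a timer softirq that interrupted the idle task swapper/0, which is why it reached all the way back to start_kernel().
In practice, you can leave the build's optimization options alone at first and read the overall chain. When a section does not match the source, change the options to get a detailed local view of that section.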
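One practical concern when placing dump_stack() in a hot path such as netdev_start_xmit() is volume: the function runs for every transmitted packet, and each dump prints dozens of lines. A simple way to keep the log readable is to allow only the first few dumps. The following is a minimal sketch of that idea in plain C; xxx_dump_allowed() and XXX_DUMP_BUDGET are hypothetical names, and the counter is not protected against concurrent callers, so treat it as a debugging aid only.

```c
#include <stdbool.h>

/* Hypothetical limit: how many dumps to allow before going quiet. */
#define XXX_DUMP_BUDGET 3

/* Returns true for the first XXX_DUMP_BUDGET calls, false afterwards.
 * Not thread-safe: concurrent callers may allow a few extra dumps. */
bool xxx_dump_allowed(void)
{
	static int remaining = XXX_DUMP_BUDGET;

	if (remaining <= 0)
		return false;

	remaining--;
	return true;
}
```

In the instrumented function, the printk() and dump_stack() calls would then sit inside `if (xxx_dump_allowed()) { ... }`. The kernel also has its own helpers for limiting output, such as printk_ratelimit() and WARN_ON_ONCE(), which may be a better fit depending on the situation.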
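In short, dump_stack() gives us one more magic weapon for analyzing code.
When skill falls short, a magic weapon fills the gap. Think of the Azure Bull demon in Journey to the West: his own abilities were mediocre, but he held a treasure, the Diamond Snare, that could even snatch away Sun Wukong's golden staff. With a weapon like that in hand, the work goes as smoothly as a fish in water.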
