@yiltoncent 2015-11-30T15:08:16.000000Z

Segments in an Executable (Expert C Programming)

C Language Basics, Linux


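Yesterday I was reading Expert C Programming. Section 2 of Chapter 6 poses a programming challenge that I expected to be trivial, but the results surprised me. Without further ado, here is the exercise.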

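Look at the segments in an executable
1. Compile the "Hello world" program, run ls -l on the executable to get its overall size, and run size to get the sizes of the individual segments.
2. Add the declaration of a global int[1000] array, recompile, and use the same commands to get the overall size and the segment sizes again. Note the differences.
3. Now add an initial value to the array declaration. This moves the array from the BSS segment to the data segment. Repeat the measurements and note how the segment sizes change.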

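From what I knew, BSS in step 2 should be 4 × 1000 = 4000 bytes larger than in step 1, and DATA in step 3 should be 4000 bytes larger than in steps 1 and 2. What actually happens? Let's find out by experiment.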

hello world 1

    #include <stdio.h>

    int main(void)
    {
        printf("Hello World!\n");
        return 0;
    }

The size output (originally a screenshot) reports:

text    data    bss
1137    280     4

hello world 2

    #include <stdio.h>

    int a[1000];

    int main(void)
    {
        printf("Hello World!\n");
        return 0;
    }

The size output (originally a screenshot) reports:

text    data    bss
1137    280     4032

BSS is now 4032 bytes, 28 bytes more than the 4 + 4000 = 4004 I expected. What is going on?

hello world 3

    #include <stdio.h>

    int a[1000] = {1};

    int main(void)
    {
        printf("Hello World!\n");
        return 0;
    }

The size output (originally a screenshot) reports:

text    data    bss
1137    4304    4

DATA is now 4304 bytes. 4304 - 280 = 4024, which is 24 bytes more than the 4000 I expected. What is going on this time?

Analysis

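Here I use nm to print the symbol tables of the three output files. (They are named .o, but the absolute addresses and the _start symbol show that they are linked executables.)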

    yiltoncent@yiltoncent-GA-MA785GM-US2H:~/ctest/expert_c_programming_deep_c_secrets$ nm -s ch6_2_1.o
    0804a020 B __bss_start
    0804a020 b completed.7181
    0804a018 D __data_start
    0804a018 W data_start
    08048360 t deregister_tm_clones
    080483d0 t __do_global_dtors_aux
    08049f0c t __do_global_dtors_aux_fini_array_entry
    0804a01c D __dso_handle
    08049f14 d _DYNAMIC
    0804a020 D _edata
    0804a024 B _end
    080484b4 T _fini
    080484c8 R _fp_hw
    080483f0 t frame_dummy
    08049f08 t __frame_dummy_init_array_entry
    080485d4 r __FRAME_END__
    0804a000 d _GLOBAL_OFFSET_TABLE_
             w __gmon_start__
    080482b0 T _init
    08049f0c t __init_array_end
    08049f08 t __init_array_start
    080484cc R _IO_stdin_used
             w _ITM_deregisterTMCloneTable
             w _ITM_registerTMCloneTable
    08049f10 d __JCR_END__
    08049f10 d __JCR_LIST__
             w _Jv_RegisterClasses
    080484b0 T __libc_csu_fini
    08048450 T __libc_csu_init
             U __libc_start_main@@GLIBC_2.0
    0804841b T main
             U puts@@GLIBC_2.0
    08048390 t register_tm_clones
    08048320 T _start
    0804a020 D __TMC_END__
    08048350 T __x86.get_pc_thunk.bx

    yiltoncent@yiltoncent-GA-MA785GM-US2H:~/ctest/expert_c_programming_deep_c_secrets$ nm -s ch6_2_2.o
    0804a040 B a
    0804a020 B __bss_start
    0804a020 b completed.7181
    0804a018 D __data_start
    0804a018 W data_start
    08048360 t deregister_tm_clones
    080483d0 t __do_global_dtors_aux
    08049f0c t __do_global_dtors_aux_fini_array_entry
    0804a01c D __dso_handle
    08049f14 d _DYNAMIC
    0804a020 D _edata
    0804afe0 B _end
    080484b4 T _fini
    080484c8 R _fp_hw
    080483f0 t frame_dummy
    08049f08 t __frame_dummy_init_array_entry
    080485d4 r __FRAME_END__
    0804a000 d _GLOBAL_OFFSET_TABLE_
             w __gmon_start__
    080482b0 T _init
    08049f0c t __init_array_end
    08049f08 t __init_array_start
    080484cc R _IO_stdin_used
             w _ITM_deregisterTMCloneTable
             w _ITM_registerTMCloneTable
    08049f10 d __JCR_END__
    08049f10 d __JCR_LIST__
             w _Jv_RegisterClasses
    080484b0 T __libc_csu_fini
    08048450 T __libc_csu_init
             U __libc_start_main@@GLIBC_2.0
    0804841b T main
             U puts@@GLIBC_2.0
    08048390 t register_tm_clones
    08048320 T _start
    0804a020 D __TMC_END__
    08048350 T __x86.get_pc_thunk.bx

    yiltoncent@yiltoncent-GA-MA785GM-US2H:~/ctest/expert_c_programming_deep_c_secrets$ nm -s ch6_2_3.o
    0804a040 D a
    0804afe0 B __bss_start
    0804afe0 b completed.7181
    0804a020 D __data_start
    0804a020 W data_start
    08048360 t deregister_tm_clones
    080483d0 t __do_global_dtors_aux
    08049f0c t __do_global_dtors_aux_fini_array_entry
    0804a024 D __dso_handle
    08049f14 d _DYNAMIC
    0804afe0 D _edata
    0804afe4 B _end
    080484b4 T _fini
    080484c8 R _fp_hw
    080483f0 t frame_dummy
    08049f08 t __frame_dummy_init_array_entry
    080485d4 r __FRAME_END__
    0804a000 d _GLOBAL_OFFSET_TABLE_
             w __gmon_start__
    080482b0 T _init
    08049f0c t __init_array_end
    08049f08 t __init_array_start
    080484cc R _IO_stdin_used
             w _ITM_deregisterTMCloneTable
             w _ITM_registerTMCloneTable
    08049f10 d __JCR_END__
    08049f10 d __JCR_LIST__
             w _Jv_RegisterClasses
    080484b0 T __libc_csu_fini
    08048450 T __libc_csu_init
             U __libc_start_main@@GLIBC_2.0
    0804841b T main
             U puts@@GLIBC_2.0
    08048390 t register_tm_clones
    08048320 T _start
    0804afe0 D __TMC_END__
    08048350 T __x86.get_pc_thunk.bx

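Comparing programs 1 and 2, we focus on the BSS segment. Look at the middle column of each line above: B (or lowercase b for a local symbol) marks a symbol in BSS.
For program 1, note the addresses of __bss_start and _end in the excerpt below.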

    yiltoncent@yiltoncent-GA-MA785GM-US2H:~/ctest/expert_c_programming_deep_c_secrets$ nm -s ch6_2_1.o
    0804a020 B __bss_start
    0804a020 b completed.7181
    0804a018 D __data_start
    0804a018 W data_start
    08048360 t deregister_tm_clones
    080483d0 t __do_global_dtors_aux
    08049f0c t __do_global_dtors_aux_fini_array_entry
    0804a01c D __dso_handle
    08049f14 d _DYNAMIC
    0804a020 D _edata
    0804a024 B _end

0x0804a024 - 0x0804a020 = 4, which is exactly the BSS size reported by size.

For program 2, again note the addresses of __bss_start and _end.

    yiltoncent@yiltoncent-GA-MA785GM-US2H:~/ctest/expert_c_programming_deep_c_secrets$ nm -s ch6_2_2.o
    0804a040 B a
    0804a020 B __bss_start
    0804a020 b completed.7181
    0804a018 D __data_start
    0804a018 W data_start
    08048360 t deregister_tm_clones
    080483d0 t __do_global_dtors_aux
    08049f0c t __do_global_dtors_aux_fini_array_entry
    0804a01c D __dso_handle
    08049f14 d _DYNAMIC
    0804a020 D _edata
    0804afe0 B _end

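0x0804afe0 - 0x0804a020 = 0xfc0 = 4032, again exactly the BSS size. Note also the other symbol, a: its address is 0x0804a040, exactly 0x20 = 32 bytes past __bss_start. Why does the build leave 32 bytes in front of the array, and what are they for?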

Resolution

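What finally cleared this up was a hint from the book 《程序员的自我修养》 (A Programmer's Self-Cultivation), whose Chapter 3, Section 3 uses objdump to analyze object files. So I ran the same analysis on programs 1 and 2.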

  1. ch6_2_1.o: file format elf32-i386
  2. Sections:
  3. Idx Name Size VMA LMA File off Algn
  4. 0 .interp 00000013 08048154 08048154 00000154 2**0
  5. CONTENTS, ALLOC, LOAD, READONLY, DATA
  6. 1 .note.ABI-tag 00000020 08048168 08048168 00000168 2**2
  7. CONTENTS, ALLOC, LOAD, READONLY, DATA
  8. 2 .note.gnu.build-id 00000024 08048188 08048188 00000188 2**2
  9. CONTENTS, ALLOC, LOAD, READONLY, DATA
  10. 3 .gnu.hash 00000020 080481ac 080481ac 000001ac 2**2
  11. CONTENTS, ALLOC, LOAD, READONLY, DATA
  12. 4 .dynsym 00000050 080481cc 080481cc 000001cc 2**2
  13. CONTENTS, ALLOC, LOAD, READONLY, DATA
  14. 5 .dynstr 0000004a 0804821c 0804821c 0000021c 2**0
  15. CONTENTS, ALLOC, LOAD, READONLY, DATA
  16. 6 .gnu.version 0000000a 08048266 08048266 00000266 2**1
  17. CONTENTS, ALLOC, LOAD, READONLY, DATA
  18. 7 .gnu.version_r 00000020 08048270 08048270 00000270 2**2
  19. CONTENTS, ALLOC, LOAD, READONLY, DATA
  20. 8 .rel.dyn 00000008 08048290 08048290 00000290 2**2
  21. CONTENTS, ALLOC, LOAD, READONLY, DATA
  22. 9 .rel.plt 00000018 08048298 08048298 00000298 2**2
  23. CONTENTS, ALLOC, LOAD, READONLY, DATA
  24. 10 .init 00000023 080482b0 080482b0 000002b0 2**2
  25. CONTENTS, ALLOC, LOAD, READONLY, CODE
  26. 11 .plt 00000040 080482e0 080482e0 000002e0 2**4
  27. CONTENTS, ALLOC, LOAD, READONLY, CODE
  28. 12 .text 00000192 08048320 08048320 00000320 2**4
  29. CONTENTS, ALLOC, LOAD, READONLY, CODE
  30. 13 .fini 00000014 080484b4 080484b4 000004b4 2**2
  31. CONTENTS, ALLOC, LOAD, READONLY, CODE
  32. 14 .rodata 00000015 080484c8 080484c8 000004c8 2**2
  33. CONTENTS, ALLOC, LOAD, READONLY, DATA
  34. 15 .eh_frame_hdr 0000002c 080484e0 080484e0 000004e0 2**2
  35. CONTENTS, ALLOC, LOAD, READONLY, DATA
  36. 16 .eh_frame 000000cc 0804850c 0804850c 0000050c 2**2
  37. CONTENTS, ALLOC, LOAD, READONLY, DATA
  38. 17 .init_array 00000004 08049f08 08049f08 00000f08 2**2
  39. CONTENTS, ALLOC, LOAD, DATA
  40. 18 .fini_array 00000004 08049f0c 08049f0c 00000f0c 2**2
  41. CONTENTS, ALLOC, LOAD, DATA
  42. 19 .jcr 00000004 08049f10 08049f10 00000f10 2**2
  43. CONTENTS, ALLOC, LOAD, DATA
  44. 20 .dynamic 000000e8 08049f14 08049f14 00000f14 2**2
  45. CONTENTS, ALLOC, LOAD, DATA
  46. 21 .got 00000004 08049ffc 08049ffc 00000ffc 2**2
  47. CONTENTS, ALLOC, LOAD, DATA
  48. 22 .got.plt 00000018 0804a000 0804a000 00001000 2**2
  49. CONTENTS, ALLOC, LOAD, DATA
  50. 23 .data 00000008 0804a018 0804a018 00001018 2**2
  51. CONTENTS, ALLOC, LOAD, DATA
  52. 24 .bss 00000004 0804a020 0804a020 00001020 2**0
  53. ALLOC
  54. 25 .comment 00000052 00000000 00000000 00001020 2**0
  55. CONTENTS, READONLY
  56. ch6_2_2.o: file format elf32-i386
  57. Sections:
  58. Idx Name Size VMA LMA File off Algn
  59. 0 .interp 00000013 08048154 08048154 00000154 2**0
  60. CONTENTS, ALLOC, LOAD, READONLY, DATA
  61. 1 .note.ABI-tag 00000020 08048168 08048168 00000168 2**2
  62. CONTENTS, ALLOC, LOAD, READONLY, DATA
  63. 2 .note.gnu.build-id 00000024 08048188 08048188 00000188 2**2
  64. CONTENTS, ALLOC, LOAD, READONLY, DATA
  65. 3 .gnu.hash 00000020 080481ac 080481ac 000001ac 2**2
  66. CONTENTS, ALLOC, LOAD, READONLY, DATA
  67. 4 .dynsym 00000050 080481cc 080481cc 000001cc 2**2
  68. CONTENTS, ALLOC, LOAD, READONLY, DATA
  69. 5 .dynstr 0000004a 0804821c 0804821c 0000021c 2**0
  70. CONTENTS, ALLOC, LOAD, READONLY, DATA
  71. 6 .gnu.version 0000000a 08048266 08048266 00000266 2**1
  72. CONTENTS, ALLOC, LOAD, READONLY, DATA
  73. 7 .gnu.version_r 00000020 08048270 08048270 00000270 2**2
  74. CONTENTS, ALLOC, LOAD, READONLY, DATA
  75. 8 .rel.dyn 00000008 08048290 08048290 00000290 2**2
  76. CONTENTS, ALLOC, LOAD, READONLY, DATA
  77. 9 .rel.plt 00000018 08048298 08048298 00000298 2**2
  78. CONTENTS, ALLOC, LOAD, READONLY, DATA
  79. 10 .init 00000023 080482b0 080482b0 000002b0 2**2
  80. CONTENTS, ALLOC, LOAD, READONLY, CODE
  81. 11 .plt 00000040 080482e0 080482e0 000002e0 2**4
  82. CONTENTS, ALLOC, LOAD, READONLY, CODE
  83. 12 .text 00000192 08048320 08048320 00000320 2**4
  84. CONTENTS, ALLOC, LOAD, READONLY, CODE
  85. 13 .fini 00000014 080484b4 080484b4 000004b4 2**2
  86. CONTENTS, ALLOC, LOAD, READONLY, CODE
  87. 14 .rodata 00000015 080484c8 080484c8 000004c8 2**2
  88. CONTENTS, ALLOC, LOAD, READONLY, DATA
  89. 15 .eh_frame_hdr 0000002c 080484e0 080484e0 000004e0 2**2
  90. CONTENTS, ALLOC, LOAD, READONLY, DATA
  91. 16 .eh_frame 000000cc 0804850c 0804850c 0000050c 2**2
  92. CONTENTS, ALLOC, LOAD, READONLY, DATA
  93. 17 .init_array 00000004 08049f08 08049f08 00000f08 2**2
  94. CONTENTS, ALLOC, LOAD, DATA
  95. 18 .fini_array 00000004 08049f0c 08049f0c 00000f0c 2**2
  96. CONTENTS, ALLOC, LOAD, DATA
  97. 19 .jcr 00000004 08049f10 08049f10 00000f10 2**2
  98. CONTENTS, ALLOC, LOAD, DATA
  99. 20 .dynamic 000000e8 08049f14 08049f14 00000f14 2**2
  100. CONTENTS, ALLOC, LOAD, DATA
  101. 21 .got 00000004 08049ffc 08049ffc 00000ffc 2**2
  102. CONTENTS, ALLOC, LOAD, DATA
  103. 22 .got.plt 00000018 0804a000 0804a000 00001000 2**2
  104. CONTENTS, ALLOC, LOAD, DATA
  105. 23 .data 00000008 0804a018 0804a018 00001018 2**2
  106. CONTENTS, ALLOC, LOAD, DATA
  107. 24 .bss 00000fc0 0804a020 0804a020 00001020 2**5
  108. ALLOC
  109. 25 .comment 00000052 00000000 00000000 00001020 2**0
  110. CONTENTS, READONLY

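Note the .bss rows, lines 52 and 107 of the listing. In program 1, .bss holds only the 4 bytes that were already there (nm shows the startup code's completed.7181 symbol at that address), and its alignment is 2^0 = 1. In program 2, the array adds 4000 bytes to those 4, for 4004 bytes of real content, but the section's alignment is now 2^5 = 32. The nm output shows where the padding goes: the existing 4 bytes stay at 0x0804a020, a is pushed to the next 32-byte boundary at 0x0804a040, and its 4000 bytes end exactly at _end = 0x0804afe0. That leaves 28 bytes of padding in front of a, so .bss measures 4 + 28 + 4000 = 4032 bytes, which is also 4004 rounded up to a multiple of 32. As for why the array gets 32-byte alignment, my guess is that it has to do with the CPU's cache mechanism.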

I will leave the discussion of the DATA segment for another time, but the approach above should carry over and can serve as a reference.
