I have a Linux ELF file, a.out, and I extract the disassembly of _start with the following command:
objdump -d ./a.out -F | awk -v RS= '/^[[:xdigit:]]+ <_start>/'
The output is:
00000000004008e0 <_start> (File Offset: 0x8e0):
4008e0: 31 ed xor %ebp,%ebp
4008e2: 49 89 d1 mov %rdx,%r9
4008e5: 5e pop %rsi
4008e6: 48 89 e2 mov %rsp,%rdx
4008e9: 48 83 e4 f0 and $0xfffffffffffffff0,%rsp
4008ed: 50 push %rax
4008ee: 54 push %rsp
4008ef: 49 c7 c0 30 19 40 00 mov $0x401930,%r8
4008f6: 48 c7 c1 a0 18 40 00 mov $0x4018a0,%rcx
4008fd: 48 c7 c7 f0 05 40 00 mov $0x4005f0,%rdi
400904: e8 d7 0a 00 00 callq 4013e0 <__libc_start_main> (File Offset: 0x13e0)
400909: f4 hlt
40090a: 66 0f 1f 44 00 00 nopw 0x0(%rax,%rax,1)
This shows that _start occupies 0x40090a - 0x4008e0 + 6 = 48 bytes. I also checked the file contents with
hexdump -C -s `echo ""|awk '{printf("%d", 0x8e0)}'` -n 48 ./a.out
which prints:
000008e0 31 ed 49 89 d1 5e 48 89 e2 48 83 e4 f0 50 54 49 |1.I..^H..H...PTI|
000008f0 c7 c0 30 19 40 00 48 c7 c1 a0 18 40 00 48 c7 c7 |..0.@.H....@.H..|
00000900 f0 05 40 00 e8 d7 0a 00 00 f4 66 0f 1f 44 00 00 |..@.......f..D..|
00000910
These bytes match the objdump output exactly.
What confuses me is that readelf -s reports the size of _start as 42, not 48. See the command and output below.
readelf -s ./sub1.r.exe | awk '{if (NR==3) print $0; if ("1277:"==$1) print $0 }'
Num: Value Size Type Bind Vis Ndx Name
1277: 00000000004008e0 42 FUNC GLOBAL DEFAULT 6 _start
Why doesn't readelf report a size of 48 for the symbol _start?
Update
Following the comments, I wrote a bash script to check every symbol in the .text section. (The script is not perfect, but it works for most cases.)
while read line; do
symbol=`echo $line | awk '{print $NF}'`
size=`echo $line | awk '{print $3}'`
objdump -d ./sub1.r.exe | awk -v RS= "/^[[:xdigit:]]+ <$symbol>/" > ./aaa.txt
nlines=`cat aaa.txt | wc -l`
[ $nlines -eq 0 ] && continue;
app=$(tail aaa.txt -n 1 | awk -F: '{print $2}' | awk \
'{
for (i=1; i<=NF; i++) {
if (match($i, "\\<[0-9a-f]{2}\\>")){
continue;
}
else{
break;
}
}
print i-1
}')
total=$(cat aaa.txt | awk -v n=$nlines -v a=$app \
'{
if (NR==2){
ns = "0x" substr($1, 0, length($1)-1);
start=strtonum(ns);
}
if (NR==n){
ns = "0x" substr($1, 0, length($1)-1)
stop =strtonum(ns);
}
} END {print stop-start + a}' )
printf "%10d %-10d %4d %s\n" $total $size $((total%16)) $symbol
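done < <(readelf -s ./a.out | awk '{if ($7==6 && $3>0) print $0}')
Although the sizes of many symbols are multiples of 16, the output of the script above does not show that every symbol obeys a 16-byte alignment constraint; some of them do not. You can compile any source file with gcc -static to get an ELF file and check it with the script above.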
Update 2
I extracted the disassembly of the function backtrace_and_maps from the objdump -d output, as shown below.
0000000000400390 <backtrace_and_maps> (File Offset: 0x390):
400390: ff cf dec %edi
400392: 0f 8e 3a 01 00 00 jle 4004d2 <backtrace_and_maps+0x142> (File Offset: 0x4d2)
400398: 40 84 f6 test %sil,%sil
40039b: 0f 84 31 01 00 00 je 4004d2 <backtrace_and_maps+0x142> (File Offset: 0x4d2)
4003a1: 55 push %rbp
4003a2: 53 push %rbx
4003a3: be 40 00 00 00 mov $0x40,%esi
4003a8: 89 d5 mov %edx,%ebp
4003aa: 48 81 ec 08 06 00 00 sub $0x608,%rsp
4003b1: 48 89 e7 mov %rsp,%rdi
4003b4: e8 97 2c 04 00 callq 443050 <__backtrace> (File Offset: 0x43050)
4003b9: 83 f8 02 cmp $0x2,%eax
4003bc: 41 89 c0 mov %eax,%r8d
4003bf: 0f 8e 04 01 00 00 jle 4004c9 <backtrace_and_maps+0x139> (File Offset: 0x4c9)
4003c5: 48 63 dd movslq %ebp,%rbx
4003c8: ba 1d 00 00 00 mov $0x1d,%edx
4003cd: be 08 26 4a 00 mov $0x4a2608,%esi
4003d2: 48 89 df mov %rbx,%rdi
4003d5: b8 01 00 00 00 mov $0x1,%eax
4003da: 0f 05 syscall
4003dc: 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax
4003e2: 76 0c jbe 4003f0 <backtrace_and_maps+0x60> (File Offset: 0x3f0)
4003e4: 48 c7 c2 d0 ff ff ff mov $0xffffffffffffffd0,%rdx
4003eb: f7 d8 neg %eax
4003ed: 64 89 02 mov %eax,%fs:(%rdx)
4003f0: 41 8d 70 ff lea -0x1(%r8),%esi
4003f4: 48 8d 7c 24 08 lea 0x8(%rsp),%rdi
4003f9: 89 ea mov %ebp,%edx
4003fb: e8 b0 2c 04 00 callq 4430b0 <__backtrace_symbols_fd> (File Offset: 0x430b0)
400400: ba 1d 00 00 00 mov $0x1d,%edx
400405: be 26 26 4a 00 mov $0x4a2626,%esi
40040a: 48 89 df mov %rbx,%rdi
40040d: b8 01 00 00 00 mov $0x1,%eax
400412: 0f 05 syscall
400414: 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax
40041a: 76 0c jbe 400428 <backtrace_and_maps+0x98> (File Offset: 0x428)
40041c: 48 c7 c2 d0 ff ff ff mov $0xffffffffffffffd0,%rdx
400423: f7 d8 neg %eax
400425: 64 89 02 mov %eax,%fs:(%rdx)
400428: 31 f6 xor %esi,%esi
40042a: bf 44 26 4a 00 mov $0x4a2644,%edi
40042f: b8 02 00 00 00 mov $0x2,%eax
400434: 0f 05 syscall
400436: 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax
40043c: 76 10 jbe 40044e <backtrace_and_maps+0xbe> (File Offset: 0x44e)
40043e: 48 c7 c2 d0 ff ff ff mov $0xffffffffffffffd0,%rdx
400445: f7 d8 neg %eax
400447: 64 89 02 mov %eax,%fs:(%rdx)
40044a: 48 83 c8 ff or $0xffffffffffffffff,%rax
40044e: 4c 63 c0 movslq %eax,%r8
400451: 31 ed xor %ebp,%ebp
400453: 41 ba 01 00 00 00 mov $0x1,%r10d
400459: ba 00 04 00 00 mov $0x400,%edx
40045e: 48 8d b4 24 00 02 00 lea 0x200(%rsp),%rsi
400465: 00
400466: 4c 89 c7 mov %r8,%rdi
400469: 89 e8 mov %ebp,%eax
40046b: 0f 05 syscall
40046d: 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax
400473: 49 89 c1 mov %rax,%r9
400476: 76 1a jbe 400492 <backtrace_and_maps+0x102> (File Offset: 0x492)
400478: 48 c7 c0 d0 ff ff ff mov $0xffffffffffffffd0,%rax
40047f: 41 f7 d9 neg %r9d
400482: 64 44 89 08 mov %r9d,%fs:(%rax)
400486: 4c 89 c7 mov %r8,%rdi
400489: b8 03 00 00 00 mov $0x3,%eax
40048e: 0f 05 syscall
400490: eb 37 jmp 4004c9 <backtrace_and_maps+0x139> (File Offset: 0x4c9)
400492: 48 85 c0 test %rax,%rax
400495: 7e ef jle 400486 <backtrace_and_maps+0xf6> (File Offset: 0x486)
400497: 4c 89 ca mov %r9,%rdx
40049a: 48 8d b4 24 00 02 00 lea 0x200(%rsp),%rsi
4004a1: 00
4004a2: 48 89 df mov %rbx,%rdi
4004a5: 44 89 d0 mov %r10d,%eax
4004a8: 0f 05 syscall
4004aa: 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax
4004b0: 76 10 jbe 4004c2 <backtrace_and_maps+0x132> (File Offset: 0x4c2)
4004b2: 48 c7 c2 d0 ff ff ff mov $0xffffffffffffffd0,%rdx
4004b9: f7 d8 neg %eax
4004bb: 64 89 02 mov %eax,%fs:(%rdx)
4004be: 48 83 c8 ff or $0xffffffffffffffff,%rax
4004c2: 49 39 c1 cmp %rax,%r9
4004c5: 74 92 je 400459 <backtrace_and_maps+0xc9> (File Offset: 0x459)
4004c7: eb bd jmp 400486 <backtrace_and_maps+0xf6> (File Offset: 0x486)
4004c9: 48 81 c4 08 06 00 00 add $0x608,%rsp
4004d0: 5b pop %rbx
4004d1: 5d pop %rbp
4004d2: c3 retq
00000000004004d3 <detach_arena.part.0> (File Offset: 0x4d3):
4004d3: 50 push %rax
4004d4: b9 68 38 4a 00 mov $0x4a3868,%ecx
4004d9: ba 75 02 00 00 mov $0x275,%edx
4004de: be e8 29 4a 00 mov $0x4a29e8,%esi
4004e3: bf c0 2d 4a 00 mov $0x4a2dc0,%edi
4004e8: e8 93 71 01 00 callq 417680 <__malloc_assert> (File Offset: 0x17680)
I also dumped the bytes of the ELF file starting at offset 0x390, with length 0x4004d2 - 0x400390 + 1 + 5 = 328, as shown below.
00000390 ff cf 0f 8e 3a 01 00 00 40 84 f6 0f 84 31 01 00 |....:...@....1..|
000003a0 00 55 53 be 40 00 00 00 89 d5 48 81 ec 08 06 00 |.US.@.....H.....|
000003b0 00 48 89 e7 e8 97 2c 04 00 83 f8 02 41 89 c0 0f |.H....,.....A...|
000003c0 8e 04 01 00 00 48 63 dd ba 1d 00 00 00 be 08 26 |.....Hc........&|
000003d0 4a 00 48 89 df b8 01 00 00 00 0f 05 48 3d 00 f0 |J.H.........H=..|
000003e0 ff ff 76 0c 48 c7 c2 d0 ff ff ff f7 d8 64 89 02 |..v.H........d..|
000003f0 41 8d 70 ff 48 8d 7c 24 08 89 ea e8 b0 2c 04 00 |A.p.H.|$.....,..|
00000400 ba 1d 00 00 00 be 26 26 4a 00 48 89 df b8 01 00 |......&&J.H.....|
00000410 00 00 0f 05 48 3d 00 f0 ff ff 76 0c 48 c7 c2 d0 |....H=....v.H...|
00000420 ff ff ff f7 d8 64 89 02 31 f6 bf 44 26 4a 00 b8 |.....d..1..D&J..|
00000430 02 00 00 00 0f 05 48 3d 00 f0 ff ff 76 10 48 c7 |......H=....v.H.|
00000440 c2 d0 ff ff ff f7 d8 64 89 02 48 83 c8 ff 4c 63 |.......d..H...Lc|
00000450 c0 31 ed 41 ba 01 00 00 00 ba 00 04 00 00 48 8d |.1.A..........H.|
00000460 b4 24 00 02 00 00 4c 89 c7 89 e8 0f 05 48 3d 00 |.$....L......H=.|
00000470 f0 ff ff 49 89 c1 76 1a 48 c7 c0 d0 ff ff ff 41 |...I..v.H......A|
00000480 f7 d9 64 44 89 08 4c 89 c7 b8 03 00 00 00 0f 05 |..dD..L.........|
00000490 eb 37 48 85 c0 7e ef 4c 89 ca 48 8d b4 24 00 02 |.7H..~.L..H..$..|
000004a0 00 00 48 89 df 44 89 d0 0f 05 48 3d 00 f0 ff ff |..H..D....H=....|
000004b0 76 10 48 c7 c2 d0 ff ff ff f7 d8 64 89 02 48 83 |v.H........d..H.|
000004c0 c8 ff 49 39 c1 74 92 eb bd 48 81 c4 08 06 00 00 |..I9.t...H......|
000004d0 5b 5d c3 50 b9 68 38 4a |[].P.h8J|
000004d8
I also grepped the output of readelf -s:
Num: Value Size Type Bind Vis Ndx Name
103: 0000000000400390 323 FUNC LOCAL DEFAULT 6 backtrace_and_maps
As you can see, the function backtrace_and_maps really does occupy 323 bytes, so its size is not aligned to 16 or 8 bytes.
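The figure of 323 can be checked from the listing itself. The following is a minimal sketch in shell arithmetic. It uses only addresses that appear in the disassembly above and assumes that detach_arena.part.0 at 0x4004d3 is the label immediately following backtrace_and_maps.

```shell
#!/bin/sh
# Addresses copied from the objdump listing above.
start=0x400390   # backtrace_and_maps
next=0x4004d3    # detach_arena.part.0 (assumed to be the very next label)

# Distance from one label to the next, in bytes.
size=$(( next - start ))

# Print the size and its remainder modulo 16.
echo "size=$size remainder_mod_16=$(( size % 16 ))"
```

This prints size=323 remainder_mod_16=3, which agrees with the readelf entry. The 328 bytes in the hexdump are these 323 bytes plus the first 5 bytes of detach_arena.part.0.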
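Posted on 2021-04-10 13:16:04
Why doesn't readelf report a size of 48 for the symbol _start?
The symbol size (at least for .text symbols) is entirely optional. Compilers usually set it, but hand-written assembly often does not.
When objdump disassembles a function, it does not look at the symbol size. It simply assumes that everything between one label and the next belongs to one function.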
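Applied to the _start listing in the question, this accounts for the 42 versus 48. The following is a minimal sketch in shell arithmetic using only addresses from that listing. The helper name span is made up for illustration, and the next label is assumed to start at 0x400910, right after the 6-byte nopw.

```shell
#!/bin/sh
# span START END: number of bytes in the half-open range [START, END).
# Hypothetical helper, named here only for illustration.
span() { echo $(( $2 - $1 )); }

# _start begins at 0x4008e0. hlt is the 1-byte instruction at 0x400909,
# so the code proper ends at 0x40090a.
span 0x4008e0 0x40090a   # bytes covered by the ELF symbol size

# The nopw at 0x40090a is 6 bytes long, so the next label is assumed to be
# at 0x400910. objdump lists everything up to that label under <_start>.
span 0x4008e0 0x400910   # bytes objdump prints under <_start>
```

This prints 42 and then 48. The symbol size stops right after hlt. The trailing nopw is most likely alignment padding, which objdump shows under _start only because no other label comes before it.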
Example:
// foo.s
foo:
nop; nop; nop
bar:
nop; nop;
.size foo, .-foo
.size bar, .-bar
nop; nop
baz:
nop
Compile with gcc -c foo.s. Symbol sizes:
readelf -Ws foo.o
Symbol table '.symtab' contains 7 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: 0000000000000000 0 SECTION LOCAL DEFAULT 1
2: 0000000000000000 0 SECTION LOCAL DEFAULT 2
3: 0000000000000000 0 SECTION LOCAL DEFAULT 3
4: 0000000000000000 5 NOTYPE LOCAL DEFAULT 1 foo
5: 0000000000000003 2 NOTYPE LOCAL DEFAULT 1 bar
6: 0000000000000007 0 NOTYPE LOCAL DEFAULT 1 baz
Disassembly from objdump:
objdump -d foo.o
foo.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <foo>:
0: 90 nop
1: 90 nop
2: 90 nop
0000000000000003 <bar>:
3: 90 nop
4: 90 nop
5: 90 nop
6: 90 nop
0000000000000007 <baz>:
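7: 90 nop
Note that the ELF symbol size of foo is larger than its objdump output (because we included part of bar in it), while the ELF symbol size of bar is smaller than its objdump output. baz has no size at all (we did not set one).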
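The same comparison can be done mechanically. The following is a rough sketch, not a general tool. It hard-codes the three symbol rows from the readelf output above, already sorted by address, and assumes that .text is 8 bytes long (the eight one-byte nops). The function name compare_sizes is made up for illustration.

```shell
#!/bin/sh
# Symbol rows copied from the readelf output above: name, value (hex), size.
# They must be sorted by value.
rows="foo 0 5
bar 3 2
baz 7 0"
text_end=8   # assumption: .text holds eight one-byte nops

# compare_sizes: print each symbol's ELF size next to the distance to the
# next label, which is the extent objdump lists under that symbol.
compare_sizes() {
    printf '%s\n' "$rows" | {
        prev_name=""
        while read -r name value size; do
            if [ -n "$prev_name" ]; then
                echo "$prev_name elf=$prev_size objdump=$(( 0x$value - prev_value ))"
            fi
            prev_name=$name; prev_value=$(( 0x$value )); prev_size=$size
        done
        # The last symbol runs to the end of the section.
        echo "$prev_name elf=$prev_size objdump=$(( text_end - prev_value ))"
    }
}

compare_sizes
```

For these rows it prints foo elf=5 objdump=3, bar elf=2 objdump=4, and baz elf=0 objdump=1. The two columns need not agree, because the first comes from the .size directives and the second only from where the labels sit.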
https://stackoverflow.com/questions/67017610