pts-elf-binary-patching-nasm-tutorial.txt
by pts@fazekas.hu at Tue Nov 1 14:54:27 CET 2005
$Id: pts-elf-binary-patching-nasm-tutorial.txt,v 1.2 2005/11/01 20:35:38 pts Exp $

pts' tutorial for simple i386 ELF binary patching using NASM
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This tutorial is for programmers who would like to make small changes to
i386 ELF binaries for which they don't have the source code, or for which
recompiling the whole source code is not feasible. This tutorial is a work
in progress.

A general outline of making changes to a binary:

1. Getting to know about a problem with a program.
2. Finding the binary which has to be changed.
3. Decompressing the binary if it has been compressed (e.g. with `upx -d').
4. Disassembling (maybe decompiling?) the binary.
5. Studying whether it is possible at all to make the change, i.e. finding
   out if the program really gets the information that needs to be
   processed. If not, then the program which does should be found first.
6. Reverse engineering: finding where and what to change in the binary.
   Can be done using the disassembled or the decompiled source, a debugger,
   `strace -i' etc. This is a tough part, which might take an unpredictably
   long time, and it is not covered in detail in this tutorial.
7. Developing the replacement code. In this tutorial we will do it in
   assembler (NASM syntax), but for many changes it might be feasible to
   write it in C -- this is not covered in the tutorial.
8. Compiling the replacement code. We will be running NASM to do the job.
9. Patching the binary, i.e. integrating the replacement code. This is the
   main focus of this tutorial.
10. Testing the new binary. Does it start up? Does it work? Do the changes
    have any effect?
11. Continuing from step 5 if the result is not satisfactory.

This tutorial presents a workflow in which the ``Patching the binary'' step
is totally automatic, provided that all previous steps are done, and the
replacement code has been developed in NASM syntax.

The following tools will be used:

-- UPX (Ultimate Packer for eXecutables) 1.93 beta
   an executable packer with good ratios
   get the sources from http://upx.sourceforge.net/
   get the Linux binary (upx.unstable) from
   http://www.math.bme.hu/~pts/cvsget.cgi/u=bin.i386/p=/M=bin.i386/c=f1/n=/bin.i386/pts-elfdisasm.static
-- pts-elfdisasm 0.14
   an i386 ELF disassembler modified by the author of this tutorial
   get the sources from http://www.inf.bme.hu/~pts/pts-elfdisasm-latest.tar.gz
   get the Linux binary (pts-elfdisasm.static) from
   http://www.math.bme.hu/~pts/cvsget.cgi/u=bin.i386/p=/M=bin.i386/c=f1/n=/bin.i386/pts-elfdisasm.static
-- NASM (Netwide Assembler) 0.98.38
   i386 assembler for many targets, including flat binary -- and a
   disassembler
   get the sources from http://sourceforge.net/projects/nasm
   get the Linux binary (nasm.static) from
   http://www.math.bme.hu/~pts/cvsget.cgi/u=bin.i386/p=/M=bin.i386/c=f1/n=/bin.i386/nasm.static
   get the Linux binary (ndisasm.static) from
   http://www.math.bme.hu/~pts/cvsget.cgi/u=bin.i386/p=/M=bin.i386/c=f1/n=/bin.i386/ndisasm.static
-- patchx.pl
   a simple Perl script to apply binary (hexadecimal) patches
   get it from
   http://www.math.bme.hu/~pts/cvsget.cgi/u=bin/p=/M=bin/c=f1/n=/bin/patchx.pl
   written by the author of this tutorial
-- nasmpatch.sh
   a simple shell script to create binary patches with NASM
   get it from
   http://www.math.bme.hu/~pts/cvsget.cgi/u=bin/p=/M=bin/c=f1/n=/bin/patchx.pl
   written by the author of this tutorial
-- objdump and objcopy
   part of GNU binutils
-- Perl
   a general purpose, powerful scripting language
   Perl can be used to find patterns in the disassembled output
-- joe
   a simple but stable text editor
-- bvi
   a hex editor for binary files with the vi keybindings
-- Midnight Commander: mc mcedit mcview
   a directory browser / file manager

Accompanying files
~~~~~~~~~~~~~~~~~~
-- double.c (we have to pretend we don't have this!)
-- double_debug (18183 bytes): ELF executable compiled with GCC 2.95, as
   `gcc -g -o double_debug double.c'.
-- double_normal (5607 bytes): ELF executable compiled with GCC 2.95, as
   `gcc -O2 -o double_normal double.c'.
-- double_stripped (3640 bytes): ELF executable compiled with GCC 2.95, as
   `gcc -O2 -s -o double_stripped double.c'.
-- sqrt.c (a sample implementation of the unsigned sqrt algorithm)
-- double_static (13036 bytes): statically linked ELF executable compiled
   with GCC 2.95, as
   `i386-uclibc-gcc -s -static -O2 -o double_static double.c'.

The example task
~~~~~~~~~~~~~~~~
Let's suppose we have a program that prints the integers in a given range
along with their doubles:

  $ ./double_stripped
  This program prints the doubles of integers.
  Usage: ./double_stripped
  $ ./double_stripped 20 10
  This program prints the doubles of integers.
  Usage: ./double_stripped
  $ ./double_stripped 3 10
  The double of 3 is 6.
  The double of 4 is 8.
  The double of 5 is 10.
  The double of 6 is 12.
  The double of 7 is 14.
  The double of 8 is 16.
  The double of 9 is 18.
  The double of 10 is 20.

Our task will be to change the program `double_stripped' to `our_sqrt' so it
will print square roots. A test case:

  $ ./our_sqrt
  This program prints the square roots of nonnegative integers.
  Usage: ./our_sqrt
  $ ./our_sqrt 20 10
  This program prints the square roots of nonnegative integers.
  Usage: ./our_sqrt
  $ ./our_sqrt -3 10
  This program prints the square roots of nonnegative integers.
  Usage: ./our_sqrt
  $ ./our_sqrt 3 10
  The square root of 3 rounded down is 1.
  The square root of 4 rounded down is 2.
  The square root of 5 rounded down is 2.
  The square root of 6 rounded down is 2.
  The square root of 7 rounded down is 2.
  The square root of 8 rounded down is 2.
  The square root of 9 rounded down is 3.
  The square root of 10 rounded down is 3.

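For the ``Testing the new binary'' step it helps to have the expected output
generated independently of the patched code. The following is only a
sketch: `isqrt' and `expected_lines' are hypothetical helper names, and the
linear search is a deliberately naive stand-in, not the algorithm in the
accompanying sqrt.c.

```shell
# Hypothetical reference model: floor(sqrt(n)) by linear search.
# Slow, but simple enough to trust; NOT the algorithm used in sqrt.c.
isqrt() {
  isqrt_r=0
  while [ $(( (isqrt_r + 1) * (isqrt_r + 1) )) -le "$1" ]; do
    isqrt_r=$(( isqrt_r + 1 ))
  done
  echo "$isqrt_r"
}

# Print the lines that `./our_sqrt FIRST LAST' is expected to produce
# for a valid, nonnegative range.
expected_lines() {
  i=$1
  while [ "$i" -le "$2" ]; do
    echo "The square root of $i rounded down is $(isqrt "$i")."
    i=$(( i + 1 ))
  done
}

expected_lines 3 10
```

Saving this output to a file and comparing it with the output of the
patched binary using diff(1) gives a quick regression check after each
patching attempt.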
This is clearly a toy task, since it is easier to rewrite the whole program
from scratch. But let's suppose that the original `double_stripped' binary,
which is in our possession, has a lot of other functionality which would be
hard to reimplement, and the only thing we have to change is the doubling
to square roots.

Bug finding
~~~~~~~~~~~
This is totally off-topic, but here are some program analysis challenges.
Look at the source of the double program in the accompanying double.c, and
the square root program in the accompanying sqrt.c. Try to do these
exercises:

-- Prove that the sqrt_down() algorithm is correct.
-- Does the `a<=b' comparison always terminate the loop? What if b is
   maximal (i.e. `b == MAX_LONG')? Change the loop condition so it always
   works. Spoiler: there is no good loop condition which works even when
   `a == MIN_LONG && b==MAX_LONG' (and thus `a == b+1') initially. An
   `if (a==b) break;' should be inserted at the end of the loop body.
-- Change sqrt_max() so it continues working even if `n > MAX_LONG'.
-- Why is `%n' needed in sscanf()?

The exercises show that even short and simple-looking programs can be
non-trivial if we try to handle all possible cases.

What to change
~~~~~~~~~~~~~~
Getting back to our binary changing task, here is a summary of what we have
to change in the `double_stripped' binary:

-- (C1) The help message (``double'' -> ``square root'' etc.).
-- (C2) The message for each line (``double'' -> ``square root'' etc.).
-- (C3) Add a non-negativity check after parsing the arguments.
-- (C4) Change the loop body so that the square root, not the double, gets
   computed (and printed).

Changing the strings (but not their size)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Use pts-elfdisasm.static to disassemble the `double_stripped' binary:

  $ pts-elfdisasm.static -d -s double_stripped

Its output:

filename=(double_stripped)
ELF.type=2
entry_point=0x08048400
Sections:
Nr. FileOffs MemAddr  Size____ Info____ Link Type     Name
 0. 00000000 00000000        0        0    0 --       ''
 1. 000000F4 080480F4       13        0    0 data     '.interp'
 2. 00000108 08048108       20        0    0 :note    '.note.ABI-tag'
 3. 00000128 08048128       44        0    4 :hash    '.hash'
 4. 0000016C 0804816C       C0        1    5 :dynsym  '.dynsym'
 5. 0000022C 0804822C       97        0    0 strtab   '.dynstr'
 6. 000002C4 080482C4       18        0    4 versym   '.gnu.version'
 7. 000002DC 080482DC       20        1    5 verneed  '.gnu.version_r'
 8. 000002FC 080482FC       18        0    4 rel      '.rel.dyn'
 9. 00000314 08048314       38        B    4 rel      '.rel.plt'
10. 0000034C 0804834C       25        0    0 data     '.init'
11. 00000374 08048374       80        0    0 data     '.plt'
12. 00000400 08048400      200        0    0 data     '.text'
13. 00000600 08048600       1C        0    0 data     '.fini'
14. 00000620 08048620       C0        0    0 data     '.rodata'
15. 000006E0 080496E0       10        0    0 data     '.data'
16. 000006F0 080496F0        4        0    0 data     '.eh_frame'
17. 000006F4 080496F4       C8        0    5 :dynamic '.dynamic'
18. 000007BC 080497BC        8        0    0 data     '.ctors'
19. 000007C4 080497C4        8        0    0 data     '.dtors'
20. 000007CC 080497CC       2C        0    0 data     '.got'
21. 000007F8 080497F8       20        0    0 :bss     '.bss'
22. 000007F8 00000000      120        0    0 data     '.comment'
23. 00000918 00000000       78        0    0 :note    '.note'
24. 00000990 00000000       BF        0    0 strtab   '.shstrtab'
Reading STATIC symbols...
no such symbol table!
Reading DYNAMIC symbols...
sym - '' =0x00000000+0x0 section=0 type=0 bind=0 other=0
sym + '__register_frame_info' =0x08048384+0x27 section=0 type=2 bind=2 other=0
sym + 'fprintf' =0x08048394+0x2a section=0 type=2 bind=1 other=0
sym + 'fflush' =0x080483A4+0xbf section=0 type=2 bind=1 other=0
sym + '__deregister_frame_info' =0x080483B4+0x23 section=0 type=2 bind=2 other=0
sym + 'stdout' =0x080497F8+0x4 section=21 type=1 bind=1 other=0
sym + 'stderr' =0x080497FC+0x4 section=21 type=1 bind=1 other=0
sym + '__libc_start_main' =0x080483C4+0xc1 section=0 type=2 bind=1 other=0
sym + 'printf' =0x080483D4+0x2f section=0 type=2 bind=1 other=0
sym + 'sscanf' =0x080483E4+0x2a section=0 type=2 bind=1 other=0
sym + '_IO_stdin_used' =0x08048624+0x4 section=14 type=1 bind=1 other=0
sym + '__gmon_start__' =0x00000000+0x0 section=0 type=0 bind=2 other=0
11 symbols read
Reading .rodata section (strings)...

section .plt:
00000374  08048374  FF35D0970408      push dword [0x80497d0] ;LT
0000037A  0804837A  FF25D4970408      jmp near [0x80497d4]
00000380  08048380  0000              add [eax],al
00000382  08048382  0000              add [eax],al
00000384  __register_frame_info:
          08048384  FF25D8970408      jmp near [0x80497d8]
0000038A  0804838A  6800000000        push dword 0x0
0000038F  0804838F  E9E0FFFFFF        jmp 0X08048374 ;JS
00000394  fprintf:
          08048394  FF25DC970408      jmp near [0x80497dc]
0000039A  0804839A  6808000000        push dword 0x8
0000039F  0804839F  E9D0FFFFFF        jmp 0X08048374 ;JS
000003A4  fflush:
          080483A4  FF25E0970408      jmp near [0x80497e0]
000003AA  080483AA  6810000000        push dword 0x10
000003AF  080483AF  E9C0FFFFFF        jmp 0X08048374 ;JS
000003B4  __deregister_frame_info:
          080483B4  FF25E4970408      jmp near [0x80497e4]
000003BA  080483BA  6818000000        push dword 0x18
000003BF  080483BF  E9B0FFFFFF        jmp 0X08048374 ;JS
000003C4  __libc_start_main:
          080483C4  FF25E8970408      jmp near [0x80497e8]
000003CA  080483CA  6820000000        push dword 0x20
000003CF  080483CF  E9A0FFFFFF        jmp 0X08048374 ;JS
000003D4  printf:
          080483D4  FF25EC970408      jmp near [0x80497ec]
000003DA  080483DA  6828000000        push dword 0x28
000003DF  080483DF  E990FFFFFF        jmp 0X08048374 ;JS
000003E4  sscanf:
          080483E4  FF25F0970408      jmp near [0x80497f0]
000003EA  080483EA  6830000000        push dword 0x30
000003EF  080483EF  E980FFFFFF        jmp 0X08048374 ;JS

section .text:
00000400  08048400  31ED              xor ebp,ebp
00000402  08048402  5E                pop esi
00000403  08048403  89E1              mov ecx,esp
00000405  08048405  83E4F0            and esp,byte -0x10
00000408  08048408  50                push eax
00000409  08048409  54                push esp
0000040A  0804840A  52                push edx
0000040B  0804840B  6800860408        push dword 0x8048600
00000410  08048410  684C830408        push dword 0x804834c
00000415  08048415  51                push ecx
00000416  08048416  56                push esi
00000417  08048417  68E0840408        push dword 0x80484e0
0000041C  0804841C  E8A3FFFFFF        call 0X080483C4 ;CS
00000421  08048421  F4                hlt
00000422  08048422  89F6              mov esi,esi
00000424  08048424  55                push ebp
00000425  08048425  89E5              mov ebp,esp
00000427  08048427  83EC14            sub esp,byte +0x14
0000042A  0804842A  53                push ebx
0000042B  0804842B  E800000000        call 0X08048430 ;CS
00000430  08048430  5B                pop ebx ;PROC
00000431  08048431  81C39C130000      add ebx,0x139c
00000437  08048437  8B8328000000      mov eax,[ebx+0x28]
0000043D  0804843D  85C0              test eax,eax
0000043F  0804843F  7402              jz 0X08048443 ;JS
00000441  08048441  FFD0              call eax
00000443  08048443  5B                pop ebx ;JT
00000444  08048444  C9                leave
00000445  08048445  C3                ret
00000446  08048446  89F6              mov esi,esi
00000448  08048448  90                nop
00000449  08048449  90                nop
0000044A  0804844A  90                nop
0000044B  0804844B  90                nop
0000044C  0804844C  90                nop
0000044D  0804844D  90                nop
0000044E  0804844E  90                nop
0000044F  0804844F  90                nop
00000450  08048450  55                push ebp
00000451  08048451  89E5              mov ebp,esp
00000453  08048453  83EC08            sub esp,byte +0x8
00000456  08048456  833DEC96040800    cmp dword [0x80496ec],byte +0x0
0000045D  0804845D  753E              jnz 0X0804849D ;JS
0000045F  0804845F  EB12              jmp short 0X08048473 ;JS
00000461  08048461  A1E8960408        mov eax,[0x80496e8] ;JT
00000466  08048466  8D5004            lea edx,[eax+0x4]
00000469  08048469  8915E8960408      mov [0x80496e8],edx
0000046F  0804846F  8B00              mov eax,[eax]
00000471  08048471  FFD0              call eax
00000473  08048473  A1E8960408        mov eax,[0x80496e8] ;JT
00000478  08048478  833800            cmp dword [eax],byte +0x0
0000047B  0804847B  75E4              jnz 0X08048461 ;JS
0000047D  0804847D  B8B4830408        mov eax,0x80483b4
                    ^-- 0x080483B4 = __deregister_frame_info
00000482  08048482  85C0              test eax,eax
00000484  08048484  740D              jz 0X08048493 ;JS
00000486  08048486  83C4F4            add esp,byte -0xc
00000489  08048489  68F0960408        push dword 0x80496f0
0000048E  0804848E  E821FFFFFF        call 0X080483B4 ;CS
00000493  08048493  C705EC9604080100  mov dword [0x80496ec],0x1 ;JT
                    -0000
0000049D  0804849D  C9                leave ;JT
0000049E  0804849E  C3                ret
0000049F  0804849F  90                nop
000004A0  080484A0  55                push ebp
000004A1  080484A1  89E5              mov ebp,esp
000004A3  080484A3  83EC08            sub esp,byte +0x8
000004A6  080484A6  C9                leave
000004A7  080484A7  C3                ret
000004A8  080484A8  55                push ebp
000004A9  080484A9  89E5              mov ebp,esp
000004AB  080484AB  83EC08            sub esp,byte +0x8
000004AE  080484AE  B884830408        mov eax,0x8048384
                    ^-- 0x08048384 = __register_frame_info
000004B3  080484B3  85C0              test eax,eax
000004B5  080484B5  7412              jz 0X080484C9 ;JS
000004B7  080484B7  83C4F8            add esp,byte -0x8
000004BA  080484BA  6800980408        push dword 0x8049800
000004BF  080484BF  68F0960408        push dword 0x80496f0
000004C4  080484C4  E8BBFEFFFF        call 0X08048384 ;CS
000004C9  080484C9  C9                leave ;JT
000004CA  080484CA  C3                ret
000004CB  080484CB  90                nop
000004CC  080484CC  55                push ebp
000004CD  080484CD  89E5              mov ebp,esp
000004CF  080484CF  83EC08            sub esp,byte +0x8
000004D2  080484D2  C9                leave
000004D3  080484D3  C3                ret
000004D4  080484D4  8DB600000000      lea esi,[esi+0x0]
000004DA  080484DA  8DBF00000000      lea edi,[edi+0x0]
000004E0  080484E0  55                push ebp
000004E1  080484E1  89E5              mov ebp,esp
000004E3  080484E3  83EC20            sub esp,byte +0x20
000004E6  080484E6  56                push esi
000004E7  080484E7  53                push ebx
000004E8  080484E8  8B5D0C            mov ebx,[ebp+0xc]
000004EB  080484EB  8B5304            mov edx,[ebx+0x4]
000004EE  080484EE  85D2              test edx,edx
000004F0  080484F0  745F              jz 0X08048551 ;JS
000004F2  080484F2  837B0800          cmp dword [ebx+0x8],byte +0x0
000004F6  080484F6  7459              jz 0X08048551 ;JS
000004F8  080484F8  837B0C00          cmp dword [ebx+0xc],byte +0x0
000004FC  080484FC  7553              jnz 0X08048551 ;JS
000004FE  080484FE  8D75FC            lea esi,[ebp-0x4]
00000501  08048501  56                push esi
00000502  08048502  8D45F8            lea eax,[ebp-0x8]
00000505  08048505  50                push eax
00000506  08048506  6840860408        push dword 0x8048640
                    ^-- 0x08048640 = "%li%n"
0000050B  0804850B  52                push edx
0000050C  0804850C  E8D3FEFFFF        call 0X080483E4 ;CS
00000511  08048511  83C410            add esp,byte +0x10
00000514  08048514  85C0              test eax,eax
00000516  08048516  7E39              jng 0X08048551 ;JS
00000518  08048518  8B5304            mov edx,[ebx+0x4]
0000051B  0804851B  8B45FC            mov eax,[ebp-0x4]
0000051E  0804851E  803C1000          cmp byte [eax+edx],0x0
00000522  08048522  752D              jnz 0X08048551 ;JS
00000524  08048524  56                push esi
00000525  08048525  8D45F4            lea eax,[ebp-0xc]
00000528  08048528  50                push eax
00000529  08048529  6840860408        push dword 0x8048640
                    ^-- 0x08048640 = "%li%n"
0000052E  0804852E  FF7308            push dword [ebx+0x8]
00000531  08048531  E8AEFEFFFF        call 0X080483E4 ;CS
00000536  08048536  83C410            add esp,byte +0x10
00000539  08048539  85C0              test eax,eax
0000053B  0804853B  7E14              jng 0X08048551 ;JS
0000053D  0804853D  8B5308            mov edx,[ebx+0x8]
00000540  08048540  8B45FC            mov eax,[ebp-0x4]
00000543  08048543  803C1000          cmp byte [eax+edx],0x0
00000547  08048547  7508              jnz 0X08048551 ;JS
00000549  08048549  8B45F4            mov eax,[ebp-0xc]
0000054C  0804854C  3945F8            cmp [ebp-0x8],eax
0000054F  0804854F  7E2F              jng 0X08048580 ;JS
00000551  08048551  83C4F8            add esp,byte -0x8 ;JT
00000554  08048554  6860860408        push dword 0x8048660
                    ^-- 0x08048660 = "This program prints the doubles of integers.\n"
00000559  08048559  FF35FC970408      push dword [0x80497fc]
                    ^-- 0x080497FC = stderr
0000055F  0804855F  E830FEFFFF        call 0X08048394 ;CS
00000564  08048564  83C4FC            add esp,byte -0x4
00000567  08048567  FF33              push dword [ebx]
00000569  08048569  688E860408        push dword 0x804868e
                    ^-- 0x0804868E = "Usage: %s \n"
0000056E  0804856E  FF35FC970408      push dword [0x80497fc]
                    ^-- 0x080497FC = stderr
00000574  08048574  E81BFEFFFF        call 0X08048394 ;CS
00000579  08048579  B802000000        mov eax,0x2
0000057E  0804857E  EB41              jmp short 0X080485C1 ;JS
00000580  08048580  83C4FC            add esp,byte -0x4 ;JT
00000583  08048583  8B55F8            mov edx,[ebp-0x8]
00000586  08048586  8D0412            lea eax,[edx+edx]
00000589  08048589  50                push eax
0000058A  0804858A  52                push edx
0000058B  0804858B  68AC860408        push dword 0x80486ac
                    ^-- 0x080486AC = "The double of %ld is %ld.\n"
00000590  08048590  E83FFEFFFF        call 0X080483D4 ;CS
00000595  08048595  83C410            add esp,byte +0x10
00000598  08048598  F645F87F          test byte [ebp-0x8],0x7f
0000059C  0804859C  7511              jnz 0X080485AF ;JS
0000059E  0804859E  83C4F4            add esp,byte -0xc
000005A1  080485A1  FF35F8970408      push dword [0x80497f8]
                    ^-- 0x080497F8 = stdout
000005A7  080485A7  E8F8FDFFFF        call 0X080483A4 ;CS
000005AC  080485AC  83C410            add esp,byte +0x10
000005AF  080485AF  8B45F8            mov eax,[ebp-0x8] ;JT
000005B2  080485B2  8D5001            lea edx,[eax+0x1]
000005B5  080485B5  8955F8            mov [ebp-0x8],edx
000005B8  080485B8  89D0              mov eax,edx
000005BA  080485BA  3B45F4            cmp eax,[ebp-0xc]
000005BD  080485BD  7EC1              jng 0X08048580 ;JS
000005BF  080485BF  31C0              xor eax,eax
000005C1  080485C1  8D65D8            lea esp,[ebp-0x28] ;JT
000005C4  080485C4  5B                pop ebx
000005C5  080485C5  5E                pop esi
000005C6  080485C6  C9                leave
000005C7  080485C7  C3                ret
000005C8  080485C8  90                nop
000005C9  080485C9  8DB42600000000    lea esi,[esi+0x0]
000005D0  080485D0  55                push ebp
000005D1  080485D1  89E5              mov ebp,esp
000005D3  080485D3  83EC14            sub esp,byte +0x14
000005D6  080485D6  53                push ebx
000005D7  080485D7  BBBC970408        mov ebx,0x80497bc
000005DC  080485DC  833DBC970408FF    cmp dword [0x80497bc],byte -0x1
000005E3  080485E3  740C              jz 0X080485F1 ;JS
000005E5  080485E5  8B03              mov eax,[ebx] ;JT
000005E7  080485E7  FFD0              call eax
000005E9  080485E9  83C3FC            add ebx,byte -0x4
000005EC  080485EC  833BFF            cmp dword [ebx],byte -0x1
000005EF  080485EF  75F4              jnz 0X080485E5 ;JS
000005F1  080485F1  5B                pop ebx ;JT
000005F2  080485F2  C9                leave
000005F3  080485F3  C3                ret
000005F4  080485F4  55                push ebp
000005F5  080485F5  89E5              mov ebp,esp
000005F7  080485F7  83EC08            sub esp,byte +0x8
000005FA  080485FA  C9                leave
000005FB  080485FB  C3                ret
000005FC  080485FC  8D742600          lea esi,[esi+0x0]

In the disassembly (the output of pts-elfdisasm.static) we see that the
string "This program prints the doubles of integers.\n" is located at
virtual memory address 0x08048660. Looking at the section headers at the
beginning of the disassembly, we see that our string is in the section
.rodata (MemAddr 0x08048620, FileOffs 0x620), so we can compute the file
offset of the string from the table:
0x08048660 - 0x08048620 + 0x620 == 0x660.

We can use the joe(1) editor to change the string (but not its length!)
there:

  $ cp double_stripped prg
  $ joe --asis --wordwrap --autoindent --crlf -overwrite prg,0x660,1000
  # change some bytes
  # press Ctrl- to save and exit
  # or press Ctrl- and confirm to exit without saving
  # or press Ctrl- to save
  # if necessary, press to change overwrite mode to insert mode (but later
    make sure that the overall size of the string doesn't change)
  # press to insert any byte by code
  # press to jump to a decimal offset (SUXX: joe cannot jump to hex offset)
  # press Ctrl- to display the current offset and current character code at
    the bottom of the screen

SUXX: joe(1) accepts a hex offset only if it doesn't contain [A-Fa-f].

The 1000 in the command line is the number of bytes to load at offset 0x660.

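The offset arithmetic and the same-length overwrite can also be scripted
instead of done by hand in an editor. The following is a minimal sketch,
not part of the toolchain listed above: `vaddr_to_offset' and
`patch_string' are hypothetical helper names, the strings are assumed to be
plain ASCII (so that character count equals byte count), and dd(1) with
`conv=notrunc' is used so the file is overwritten in place without being
truncated.

```shell
# Hypothetical helper: map a virtual address to a file offset, given the
# MemAddr and FileOffs columns of the section that contains the address.
# $1 = virtual address, $2 = section MemAddr, $3 = section FileOffs
vaddr_to_offset() {
  printf '0x%x\n' $(( $1 - $2 + $3 ))
}

# Hypothetical helper: overwrite a string in place, refusing to change its
# length. Assumes ASCII strings, so ${#var} is the length in bytes.
# $1 = file, $2 = file offset, $3 = expected old string, $4 = new string
patch_string() {
  if [ ${#3} -ne ${#4} ]; then
    echo "length mismatch: ${#3} vs ${#4} bytes" >&2
    return 1
  fi
  # Check that the file really contains the old string at that offset.
  current=$(dd if="$1" bs=1 skip=$(( $2 )) count=${#3} 2>/dev/null)
  if [ "$current" != "$3" ]; then
    echo "unexpected bytes at offset $2" >&2
    return 1
  fi
  # conv=notrunc keeps the rest of the file (and its size) intact.
  printf '%s' "$4" | dd of="$1" bs=1 seek=$(( $2 )) conv=notrunc 2>/dev/null
}

# The .rodata numbers from the section table above:
vaddr_to_offset 0x08048660 0x08048620 0x620

# Example call (the word "doubles" starts 24 bytes into the help string,
# i.e. at 0x660 + 24 = 0x678); commented out because it modifies prg:
# patch_string prg 0x678 doubles DOUBLES
```

Checking the old bytes before writing guards against a miscalculated
offset, which would otherwise silently corrupt an unrelated part of the
binary.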
Here is how to convert 32-bit numbers between dec and hex:

  $ perl -le "print 0xff"
  255
  $ perl -le 'printf"%x\n", 255'
  ff

Alternatively, you can use the hex editor of Midnight Commander to make the
changes:

  $ cp double_stripped prg
  $ mcview prg
  # press to toggle hex mode
  # press to toggle editing
  # press again or to change to editing text (not the hexdump)
  # use the arrow keys to move around
  # press , type 0x660, press to jump to the desired offset
  # make sure the editing cursor is in the right side (``EdHex'' should be
    displayed in the bottom line)
  # type the replacement text
  # press to save and continue editing
  # press , confirm save if needed, and let it exit

The disadvantage of mcview is that it tries to be smart when opening files,
which might result in running some filter on the file, so it might take a
long time to open a large (multi-gigabyte) binary file in mcview. If that
happens, just disable filtering. Also, mcview cannot insert bytes. (mcedit
can insert bytes, but it is not binary safe. joe can insert bytes in a
binary safe way, but the whole file has to be opened, without specifying
offsets.)

It is recommended to have multiple terminal windows open (for example,
several xterms in the 6x13 fixed font), and to keep a joe or mcedit running
in one of them for a long time, never exiting from it. All shell commands
should be issued from another xterm, dedicated to that purpose. This
tutorial should be read in a third xterm. This kind of allocation of
terminal windows speeds up the work.

Make sure you don't change the length of the string (and thus the length of
the file), because that would change the offset of all subsequent strings.
The executable has pointers to all strings, and those pointers would then
point to the wrong locations.

Changing the string (including its length)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
For simplicity, we don't change the original string, but we add a new string
to the binary.

This is a typical section layout of an ELF binary:

-- .text: contains most of the executable program code (it is disassembled
   by pts-elfdisasm)
-- .plt, .init, .fini, .ctors, .dtors: contain some auxiliary program code
   (only .plt is disassembled by pts-elfdisasm)
-- .data: initialized, read-write data, such as `int x=5;' outside a
   function
-- .rodata: initialized, read-only data, such as strings or
   `const int y=5;' outside a function
-- .bss: uninitialized, read-write data, such as `int z;' outside a
   function. This is not in the ELF file, only virtual memory is reserved
   for it.

So we are going to add a new string to the .text section. Looking at the
disassembly, we see that .text is 0x200 bytes long, at virtual memory
address 0x08048400, and file position 0x400. To add a new string, we have
to:

1. Insert the string into the file at offset 0x400. Let S be the length of
   the string with a terminating '\0', rounded up to the nearest multiple
   of 16 (more than needed, 4 would be enough). This is easy with joe.
2. Change the section header table so .text is now S bytes longer. This can
   be done in 5 minutes.
3. Move all sections that physically follow .text by S bytes. This is
   tougher, needs programming.
4. Change all code sections so all addresses above 0x08048400 are increased
   by S. This is even tougher.
5. Change the code to point to our new string at 0x08048400 instead of the
   original string. This is easy.

To be continued...

---

SUXX: bastard installation
# at Tue Nov 1 21:27:07 CET 2005
# SUXX: make install compiles twice
CVS nightly tarball CVSROOT
cd bastard/src/formats/MAGIC
touch cfg.h
make
cp libMAGIC.so /usr/local/lib/bastard/formats
cp MAGIC.{format,dat} /usr/local/lib/bastard/formats
rm -rf ~/.bastard/.prg.bdb/
rm -rf ~/.bastard/prg.bdb/
# ^^^ without: /home/guests/pts/.bastard/prg.bdb already exists
      Error 4501: Previously saved .bdb exists for target in $DB_PATH
$ bastard load prg
# SUXX: the `string' command doesn't work...

---

!! where does the section header table start in the file? (pts-elfdisasm)
!! why lea esi,...
!! new code in C, relocations, linkage
!! multiple patches, don't count the length total
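
On the first open question above (where the section header table starts),
the ELF32 file header stores the answer: e_shoff at byte 32, e_shentsize at
byte 46 and e_shnum at byte 48. The sketch below reads those fields with
od(1). `read_u32', `read_u16' and `shdr_info' are hypothetical helper
names, and the code assumes a 32-bit ELF file examined on a little-endian
host, since od decodes integers in host byte order. The numbers in the
listing are at least consistent with this: .shstrtab ends at
0x990 + 0xBF = 0xA4F, and 25 entries of 40 bytes starting at 0xA50 would
end exactly at 3640, the size of double_stripped. The listing itself does
not show e_shoff, so treat 0xA50 as an inference.

```shell
# Assumptions: 32-bit ELF, little-endian host (od reads integers in host
# byte order). Offsets 32, 46 and 48 are those of the ELF32 file header.
# $1 = file, $2 = byte offset of the field
read_u32() { od -A n -t u4 -j "$2" -N 4 "$1" | tr -d ' '; }
read_u16() { od -A n -t u2 -j "$2" -N 2 "$1" | tr -d ' '; }

# Hypothetical helper: print where the section header table lives.
shdr_info() {
  shoff=$(read_u32 "$1" 32)       # e_shoff: file offset of the table
  shentsize=$(read_u16 "$1" 46)   # e_shentsize: bytes per table entry
  shnum=$(read_u16 "$1" 48)       # e_shnum: number of table entries
  printf 'table at 0x%x, %d entries of %d bytes, ends at 0x%x\n' \
    "$shoff" "$shnum" "$shentsize" $(( shoff + shnum * shentsize ))
}

# Usage (not run here, since it needs the binary): shdr_info prg
```

Any section that is moved or resized while adding a string has to be
reflected in this table, and if the table itself moves, e_shoff has to be
updated as well.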