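Background:
Today I was chatting with some people in an ops group chat, mostly complaining about how hard data recovery is on the EXT4 filesystem. I have tried recovering data on both EXT3 and EXT4 before, mainly with ext3grep and ext4undelete. On EXT3 the recovery succeeded almost every time, but on EXT4 it never succeeded once: every recovered file was either corrupted or incomplete.
During the discussion, some of the more experienced members said they had recovered data from EXT4 successfully, and that people who complain about Linux filesystems being hard to recover data from are simply unfamiliar with the fundamentals. Read up on blocks, inodes, and the superblock, they said, and you will see file deletion in a different light.
I felt rather ashamed when I heard this. It is true that I had only ever worked at the tool level and had never studied these topics in depth.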
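Then I brought up a scenario: a process keeps writing to a log file while it runs, and someone deletes that log by mistake. The group members said recovery is very simple in this case, because the file actually still exists on the system. The process must be kept running, though: once it is stopped or restarted, the file is gone.
I am sure many of you have had a similar experience: while cleaning up disk space you delete some large log files, but the space is not freed until the service is restarted or the process is killed.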
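The reason is that rm only removes the directory entry. The inode and its data blocks stay allocated for as long as some process still holds an open file descriptor on the file. Below is a minimal sketch of this behavior, assuming a Linux system with /proc mounted. The file name demo.log, the temporary directory, and descriptor number 3 are hypothetical choices made for illustration.

```shell
#!/bin/bash
# Sketch: a deleted file stays readable while a descriptor on it is open.
workdir=$(mktemp -d)                 # hypothetical scratch directory
demo_log="$workdir/demo.log"         # hypothetical file name

# Open the file for writing on descriptor 3 of this shell and write a line.
exec 3> "$demo_log"
echo "line written before rm" >&3

# Remove the directory entry; the inode survives because fd 3 is still open.
rm -f "$demo_log"

# The kernel appends "(deleted)" to the link target under /proc.
link_target=$(readlink "/proc/$$/fd/3")
echo "fd 3 -> $link_target"

# The content can still be read back through /proc.
still_there=$(cat "/proc/$$/fd/3")
echo "content: $still_there"

# Closing the last descriptor lets the kernel free the inode and its blocks.
exec 3>&-
rmdir "$workdir"
```

On a real server, `lsof +L1` lists open files whose link count is below 1, which is one way to find out which process is still holding the deleted file and its disk space.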
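So I did a quick search and found this post: http://unix.stackexchange.com/questions/101237/how-to-recover-files-i-deleted-now-by-running-rm
I then successfully reproduced the scenario by having the ping command write a log and deleting that log while ping was still running.
The steps were as follows: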
[dong@idc1-dong1 ~]$ ping 111cn.net &> ping.output.log &
[1] 22672
[dong@idc1-dong1 ~]$ tail -n 5 ping.output.log
64 bytes from 54.238.131.140: icmp_seq=14 ttl=47 time=176 ms
64 bytes from 54.238.131.140: icmp_seq=15 ttl=47 time=126 ms
64 bytes from 54.238.131.140: icmp_seq=16 ttl=47 time=205 ms
64 bytes from 54.238.131.140: icmp_seq=17 ttl=47 time=121 ms
64 bytes from 54.238.131.140: icmp_seq=18 ttl=47 time=121 ms
[dong@idc1-dong1 ~]$ rm -f ping.output.log
[dong@idc1-dong1 ~]$ ls ping.output.log
ls: cannot access ping.output.log: No such file or directory
[dong@idc1-dong1 ~]$ sudo lsof | grep ping.output
ping 22672 dong 1w REG 253,0 2666 2016 /home/dong/ping.output.log (deleted)
ping 22672 dong 2w REG 253,0 2666 2016 /home/dong/ping.output.log (deleted)
[dong@idc1-dong1 ~]$ sudo -i
[root@idc1-dong1 ~]# cd /proc/22672/fd
[root@idc1-dong1 fd]# ll
total 0
lrwx------ 1 root root 64 Sep 1 11:23 0 -> /dev/pts/0
l-wx------ 1 root root 64 Sep 1 11:23 1 -> /home/dong/ping.output.log (deleted)
l-wx------ 1 root root 64 Sep 1 11:23 2 -> /home/dong/ping.output.log (deleted)
lrwx------ 1 root root 64 Sep 1 11:23 3 -> socket:[26968949]
[root@idc1-dong1 fd]# tail -n 5 1
64 bytes from 54.238.131.140: icmp_seq=119 ttl=47 time=161 ms
64 bytes from 54.238.131.140: icmp_seq=120 ttl=47 time=125 ms
64 bytes from 54.238.131.140: icmp_seq=121 ttl=47 time=198 ms
64 bytes from 54.238.131.140: icmp_seq=122 ttl=47 time=151 ms
64 bytes from 54.238.131.140: icmp_seq=123 ttl=47 time=135 ms
[root@idc1-dong1 fd]# tail -n 5 2
64 bytes from 54.238.131.140: icmp_seq=121 ttl=47 time=198 ms
64 bytes from 54.238.131.140: icmp_seq=122 ttl=47 time=151 ms
64 bytes from 54.238.131.140: icmp_seq=123 ttl=47 time=135 ms
64 bytes from 54.238.131.140: icmp_seq=124 ttl=47 time=135 ms
64 bytes from 54.238.131.140: icmp_seq=125 ttl=47 time=134 ms
[root@idc1-dong1 fd]# cp 1 /root/ping.output.log.recover
[root@idc1-dong1 fd]# cd
[root@idc1-dong1 ~]# head -n 5 ping.output.log.recover
PING 111cn.net (54.238.131.140) 56(84) bytes of data.
64 bytes from 54.238.131.140: icmp_seq=1 ttl=47 time=227 ms
64 bytes from 54.238.131.140: icmp_seq=2 ttl=47 time=196 ms
64 bytes from 54.238.131.140: icmp_seq=3 ttl=47 time=157 ms
64 bytes from 54.238.131.140: icmp_seq=4 ttl=47 time=235 ms
[root@idc1-dong1 ~]# tail -n 5 ping.output.log.recover
64 bytes from 54.238.131.140: icmp_seq=146 ttl=47 time=172 ms
64 bytes from 54.238.131.140: icmp_seq=147 ttl=47 time=132 ms
64 bytes from 54.238.131.140: icmp_seq=148 ttl=47 time=212 ms
64 bytes from 54.238.131.140: icmp_seq=149 ttl=47 time=172 ms
64 bytes from 54.238.131.140: icmp_seq=150 ttl=47 time=132 ms
[root@idc1-dong1 ~]# pkill -kill -f ping
[root@idc1-dong1 ~]# cd /proc/22672/fd
-bash: cd: /proc/22672/fd: No such file or directory
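
The session above can be condensed into a small script. This is a sketch under stated assumptions, not the exact procedure used above: a shell loop stands in for ping so that no network access is needed, the file names app.log and app.log.recover are hypothetical, and a `sleep` that accepts fractional seconds (as GNU sleep does) is assumed.

```shell
#!/bin/bash
# Sketch: recover a log that was deleted while its writer is still running.
workdir=$(mktemp -d)
log="$workdir/app.log"                 # hypothetical log path
recovered="$workdir/app.log.recover"   # hypothetical recovery target

# Stand-in for ping: a background loop that keeps writing to the log.
( i=0; while :; do i=$((i + 1)); echo "seq=$i"; sleep 0.1; done ) > "$log" 2>&1 &
writer_pid=$!

sleep 0.5                              # let a few lines accumulate
rm -f "$log"                           # the "accidental" deletion

# Find the descriptor whose link target is the deleted log ...
deleted_fd=""
for fd in /proc/"$writer_pid"/fd/*; do
    case "$(readlink "$fd")" in
        *"/app.log (deleted)") deleted_fd=$fd; break ;;
    esac
done

# ... and copy the content out while the writer is still alive.
cp "$deleted_fd" "$recovered"
first_line=$(head -n 1 "$recovered")
echo "recovered $(wc -l < "$recovered") lines, first line: $first_line"

# Once the writer exits, /proc/<pid> disappears and the data cannot be
# reached this way any more.
kill "$writer_pid"
wait "$writer_pid" 2>/dev/null || true
```

As in the session above, the copy is a snapshot: lines the writer emits after the `cp` are not included, so the copy should be made as late as possible before the process is stopped.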