Thursday, 14 August 2014

FreeNAS - Permanent errors have been detected

I was looking at one of my FreeNAS installations the other day and found that one of my drives was reporting errors. As far as I remember, the affected files were corrupted during the file transfer, so I was fairly sure the drive itself was OK; I had it tested with UBCD/Vivard anyway. Here is what I did to clear the errors. If you see such errors constantly, consider replacing the drive (see the references).

zpool status -v

pool: ITSoft                                                                  
state: ONLINE                                                                  
status: One or more devices has experienced an error resulting in data          
        corruption.  Applications may be affected.                              
action: Restore the file in question if possible.  Otherwise restore the        
        entire pool from backup.                                                
   see: http://illumos.org/msg/ZFS-8000-8A                                      
  scan: scrub repaired 0 in 0h12m with 2 errors on Sun Jul 27 00:12:58 2014     
config:                                                                         
                                                                                
        NAME                                          STATE     READ WRITE CKSUM
        ITSoft                                        ONLINE       0     0     2
          gptid/6358bf65-f6ea-11e3-9135-080027eb8f88  ONLINE       0     0     4
                                                                                
errors: Permanent errors have been detected in the following files:             
                                                                                
        /mnt/ITSoft/DriverPack_12.3.iso                                 
        /mnt/ITSoft/DRP13-R377-DVD.iso                       
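
If the list of affected files is long, it can help to pull it out of the status output. The following is only a minimal sketch, assuming a POSIX sh with awk and the output layout shown above; the function name list_damaged_entries is made up for this example and is not part of ZFS or FreeNAS.

```shell
#!/bin/sh
# Hypothetical helper (not a ZFS tool): print every entry listed under
# "errors: Permanent errors have been detected in the following files:".
# It reads `zpool status -v` output on standard input.
list_damaged_entries() {
    awk '
        /^errors: Permanent errors/ { in_list = 1; next }  # start of the list
        /^ *pool:/                  { in_list = 0 }        # a new pool begins
        in_list && NF {                                    # skip blank lines
            sub(/^[ \t]+/, "")                             # trim leading space
            sub(/[ \t]+$/, "")                             # trim trailing space
            print
        }
    '
}

# Usage on a live system (not executed here):
#   zpool status -v ITSoft | list_damaged_entries
```

Entries are printed exactly as ZFS reports them, so both full paths and object references such as ITSoft:<0xab> come through unchanged.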

zpool clear ITSoft gptid/6358bf65-f6ea-11e3-9135-080027eb8f88

Here I made the mistake of deleting both files. Don't do it before clearing the drive. Two explanations of why, quoted from the references:

"That error is telling you that inode <0x9f115> is corrupt (deleting the file broke the filename->inode mapping, so it's just reporting the inode now). Either something still has the file open or the metadata just needs to be cleaned up (which a scrub should do)."

"Normally, the path and name of the file with the error would be displayed, the <0x80> suggests the corrupted file was deleted (before or after the error). It is a reference to the indirect block that contained the file. This suggests that no active files were corrupted, just a file that was already deleted, or has since been deleted."

 pool: ITSoft                                                                  
 state: ONLINE                                                                  
status: One or more devices has experienced an error resulting in data          
        corruption.  Applications may be affected.                              
action: Restore the file in question if possible.  Otherwise restore the        
        entire pool from backup.                                                
   see: http://illumos.org/msg/ZFS-8000-8A                                      
  scan: scrub repaired 0 in 0h12m with 2 errors on Sun Jul 27 00:12:58 2014     
config:                                                                         
                                                                                
        NAME                                          STATE     READ WRITE CKSUM
        ITSoft                                        ONLINE       0     0     2
          gptid/6358bf65-f6ea-11e3-9135-080027eb8f88  ONLINE       0     0     0
                                                                                
errors: Permanent errors have been detected in the following files:             
                                                                                
        ITSoft:<0xab>                                                           
        ITSoft:<0xae>              

With the damage done, I had to repair it by scrubbing the pool:
zpool scrub ITSoft

This takes some time, so run zpool status -v now and then to check the progress.
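
Instead of re-running the status command by hand, the check can be wrapped in a small loop. This is a sketch under assumptions: it relies on the scan line containing the text "scrub in progress" while the scrub runs, which is what this generation of ZFS prints, and the function name scrub_in_progress is invented for the example.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit status 0) if the `zpool status` output
# read from standard input reports a running scrub.
# Assumption: the scan line reads "scrub in progress since ..." during a scrub.
scrub_in_progress() {
    grep -q 'scrub in progress'
}

# Usage on a live system (not executed here): poll once a minute,
# then show the final result.
#   while zpool status ITSoft | scrub_in_progress; do sleep 60; done
#   zpool status -v ITSoft
```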

pool: ITSoft                                                                  
 state: ONLINE                                                                  
status: One or more devices has experienced an unrecoverable error.  An         
        attempt was made to correct the error.  Applications are unaffected.    
action: Determine if the device needs to be replaced, and clear the errors      
        using 'zpool clear' or replace the device with 'zpool replace'.         
   see: http://illumos.org/msg/ZFS-8000-9P                                      
  scan: scrub repaired 0 in 0h12m with 0 errors on Thu Aug 14 00:30:25 2014     
config:                                                                         
                                                                                
        NAME                                          STATE     READ WRITE CKSUM
        ITSoft                                        ONLINE       0     0     0
          gptid/6358bf65-f6ea-11e3-9135-080027eb8f88  ONLINE       0     0    89
                                                                                
errors: No known data errors     
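
The pool now reports no data errors, but the device still carries a CKSUM count of 89, which is why another clear follows. To spot such leftover counters quickly, something like the following could be used. It is a minimal sketch assuming the column layout shown above (NAME STATE READ WRITE CKSUM); nonzero_error_devices is a made-up name.

```shell
#!/bin/sh
# Hypothetical helper: list every pool or device row in the config table of
# `zpool status` output (read from standard input) whose READ, WRITE or CKSUM
# counter is not zero.
nonzero_error_devices() {
    awk '
        $1 == "NAME" && $2 == "STATE" { in_cfg = 1; next }  # table header
        /^errors:/                    { in_cfg = 0 }        # table has ended
        in_cfg && NF >= 5 && ($3 != 0 || $4 != 0 || $5 != 0) {
            print $1, "READ=" $3, "WRITE=" $4, "CKSUM=" $5
        }
    '
}

# Usage on a live system (not executed here):
#   zpool status ITSoft | nonzero_error_devices
```

No output means every counter in the table is zero.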

And, just in case, another clear:
zpool clear -F ITSoft

End result:


 pool: ITSoft                                                                  
 state: ONLINE                                                                  
  scan: scrub repaired 0 in 0h12m with 0 errors on Thu Aug 14 00:30:25 2014     
config:                                                                         
                                                                                
        NAME                                          STATE     READ WRITE CKSUM
        ITSoft                                        ONLINE       0     0     0
          gptid/6358bf65-f6ea-11e3-9135-080027eb8f88  ONLINE       0     0     0
                                                                                
errors: No known data errors    

References:
http://serverfault.com/questions/581351/zfs-checksum-errors-what-files-are-affected
http://serverfault.com/questions/576898/clear-a-permanent-zfs-error-in-a-healthy-pool
http://www.retrospekt.dk/2011/08/zfs-drive-replacement/