Does rsync verify files copied between two local drives?
I want to make a fresh new copy of a large number of files from one local drive to another.
I've read that rsync does a checksum comparison of files when sending them to a remote machine over a network.
Will rsync make the comparison when copying the files between two local drives?
If it does do a verification - is it a safe bet? Or is it better to do a byte by byte comparison?
Tags: rsync, verification
asked Feb 5 '12 at 22:35 by Frez; edited Feb 5 '12 at 23:05 by Frez
5 Answers
rsync always uses checksums to verify that a file was transferred correctly. If the destination file already exists, rsync may skip updating the file if the modification time and size match the source file, but if rsync decides that data need to be transferred, checksums are always used on the data transferred between the sending and receiving rsync processes. This verifies that the data received are the same as the data sent with high probability, without the heavy overhead of a byte-level comparison over the network.
Once the file data are received, rsync writes the data to the file and trusts that if the kernel indicates a successful write, the data were written without corruption to disk. rsync does not reread the data and compare against the known checksum as an additional check.
As for the verification itself, for protocol 30 and beyond (first supported in 3.0.0), rsync uses MD5. For older protocols, the checksum used is MD4.
While long considered obsolete for secure cryptographic hashes, MD5 and MD4 remain adequate for checking file corruption.
Source: the man page and eyeballing the rsync source code to verify.
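To make the difference between checking the transferred stream and re-reading the disk concrete, here is a minimal shell sketch. It is not rsync's code or protocol: `copy_with_stream_checksum` is a hypothetical helper, and `md5sum` and `tee` merely stand in for the sending and receiving sides. The "received" checksum is computed from the bytes passing through the pipe, so a match says nothing about what would later be read back from the destination disk.

```sh
#!/bin/sh
# Hypothetical helper, not part of rsync: copy SRC to DST while hashing the
# bytes in flight, the way a receiver can checksum a stream as it arrives.
copy_with_stream_checksum() {
    src=$1
    dst=$2
    [ -r "$src" ] || return 2
    # "Sender" side: checksum of the source data.
    sent=$(md5sum < "$src" | cut -d' ' -f1)
    # "Receiver" side: tee writes the stream to DST and passes the same
    # bytes on to md5sum. DST itself is never read back.
    # Sketch only: a failed write by tee is not detected here.
    received=$(tee "$dst" < "$src" | md5sum | cut -d' ' -f1)
    [ "$sent" = "$received" ]
}
```

A call such as `copy_with_stream_checksum bigfile /mnt/usb/bigfile` would succeed whenever the stream was intact, even if the drive later returned different bytes.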
(+3) I hate to burst everyone’s bubble, but rsync only does checksum verification if the -c flag is added! – user30825, Jan 21 '13 at 21:32
(+25) @clint No, the answer is correct. From the man page's explanation of the -c flag: "Note that rsync always verifies that each transferred file was correctly reconstructed on the receiving side by checking a whole-file checksum that is generated as the file is transferred, but that automatic after-the-transfer verification has nothing to do with this option's before-the-transfer "Does this file need to be updated?" check." – Michael Mrozek♦, Jan 21 '13 at 21:41
(+6) This answer does not make it clear if it actually verifies the file after a copy. If the checksum is computed as the file is being received, then it is not a post-copy checksum and you cannot be sure that the file is written correctly. You would then need to perform an additional comparison. – Andre Miller, Mar 24 '15 at 21:26
(+3) @AndreMiller Thanks for the comment. I've updated the answer to address that issue. – Kyle Jones, Mar 24 '15 at 22:02
(+7) Down-voting because I don't like the fact that this answer is detailed, well written and technically correct, and at the same time so far off topic that it misleads readers. The problem is that the answer goes into great detail on what happens during transfer while the questioner specifically states that he cares about local copies and not network transfers. I'm pretty sure Kyle Jones didn't want to mislead anyone but this answer (IMHO) does. – ndemou, Jun 29 '16 at 19:38
rsync does not do post-copy verification for local file copies. You can check this by using rsync to copy a large file to a slow (e.g. USB) drive, and then copying the same file with cp:
time rsync bigfile /mnt/usb/bigfile
time cp bigfile /mnt/usb/bigfile
Both commands take about the same amount of time, so rsync cannot be checksumming the destination, since that would involve re-reading the destination file off the slow disk.
The man page is unfortunately misleading about this. I also verified this with strace: after the copy is complete, rsync issues no read() calls on the destination file, so it cannot be checksumming it. Another way to verify it is with something like iotop: you see rsync reading and writing simultaneously (copying from source to destination), and then it exits. If it were verifying integrity, there would be a read-only phase.
(+1) "The man page is unfortunately misleading about this. I also verified this with strace" Did you strace the remote, running rsync process or the local one? There are two... one runs on the destination, even when you use ssh. – user129070, May 6 '13 at 19:20
(+8) There is no post-copy verification for any copies, local or remote. You run rsync -c again if you want to force it to check. – psusi, May 6 '13 at 23:50
The verification is done on the incoming stream as it goes. It's not necessary to read it back from the disk if the filesystem has confirmed it's been written. – OrangeDog, Jul 11 '18 at 15:51
rsync makes a checksum comparison before copying (in some cases), to avoid copying what's already there. The point of the checksum comparison is not to verify that the copy was successful. That's the job of the underlying infrastructure: the filesystem drivers, the disk drivers, the network drivers, etc. Individual applications such as rsync don't need to bother with this madness. All rsync needs to do (and does!) is to check the return values of system calls to make sure there was no error.
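A script that wraps a copy can apply the same "check the return value" idea one level up by inspecting the exit status. The following is a minimal sketch: `run_and_report` is a hypothetical helper that accepts any command, and the meanings given for 23 and 24 are those listed under EXIT VALUES in the rsync man page.

```sh
#!/bin/sh
# Hypothetical wrapper: run a copy command and report on its exit status.
run_and_report() {
    status=0
    "$@" || status=$?
    case $status in
        0)  echo "ok: command reported no errors" ;;
        23) echo "error: partial transfer due to error" ;;
        24) echo "warning: partial transfer, some source files vanished" ;;
        *)  echo "error: command exited with status $status" ;;
    esac
    return "$status"
}

# Example (paths are placeholders):
#   run_and_report rsync -a /data/ /mnt/backup/
```

A zero status only means that no system call reported an error; it is not a statement about what the disk will return on a later read.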
(+1) This seems to contradict the accepted answer... – djule5, Jan 13 '16 at 6:45
(+2) @djule5 In what way? The accepted answer seems to mostly be about how rsync checks transferred files, but the question, and my answer, are about local copies. – Gilles, Jan 13 '16 at 10:16
(+3) Ok, well in that context I agree it makes more sense. So "The point of the checksum comparison is not to verify that the copy was successful" is true only for local copies; and "checksums are always used on the data transferred between the sending and receiving rsync processes" is true only for transferred copies. I find the accepted answer misleading in regard to the question and believe your answer should be the accepted one (just my 2 cents). – djule5, Jan 13 '16 at 18:23
I still feel this answer is slightly misleading. For example, it says that the network drivers in particular verify if the copy was successful - but if you were saying that checksum comparison does not verify if the copy was successful for local only, network drivers would not come into play. – Ken, Aug 7 '17 at 19:56
(+1) @Ken I don't understand the point you're trying to make. I suspect you misread something. The network drivers come into play only if there's a network copy. Rsync itself does a checksum comparison before doing any copy, in order to decide whether to copy. Rsync doesn't do any checksum comparison after copying (because it would be pointless: it knows what it's just copied). – Gilles, Aug 7 '17 at 20:04
Quick and dirty answers, directly to the questions.
Q: Will rsync make the comparison when copying the files between two local drives?
A: It will do a comparison to figure out what to copy.
Q: If it does do a verification - is it a safe bet? Or is it better to do a byte by byte comparison?
A: It is as safe as the mathematics behind the MD5 checksum of a file. You can run a simple experiment to learn and trust the tool.
Long answer: I guess you wanted rsync to do a file comparison (bit by bit or by checksum) after copying files. If you are one of the few who value data integrity, you might find the following useful:
rsync -avh [source] [destination] && rsync -avhc [source] [destination]
The command above copies the files on the first run and, if that completes without error, immediately runs rsync again, this time comparing files of the same name by a hash of the entire file and re-copying any that differ.
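For the byte-by-byte alternative raised in the question, here is a minimal sketch that uses standard tools rather than rsync. `compare_trees` is a hypothetical helper; it checks only the regular files found under the source, so extra files in the destination and metadata differences go unnoticed.

```sh
#!/bin/sh
# Hypothetical helper: byte-by-byte comparison of every regular file under
# SRC against the file with the same relative path under DST.
compare_trees() {
    [ -d "$1" ] && [ -d "$2" ] || return 2
    src=$(cd "$1" >/dev/null && pwd) || return 2
    dst=$(cd "$2" >/dev/null && pwd) || return 2
    # Collect the relative path of each file that differs or is missing.
    # File names containing newlines are not handled in this sketch.
    mismatches=$(
        cd "$src" && find . -type f | while IFS= read -r f; do
            # cmp -s is silent and exits non-zero if the files differ
            # or the destination file does not exist.
            cmp -s "$f" "$dst/$f" || printf '%s\n' "$f"
        done
    )
    if [ -n "$mismatches" ]; then
        printf '%s\n' "$mismatches"
        return 1
    fi
    return 0
}
```

Calling `compare_trees /path/to/source /path/to/copy` prints the relative paths that differ and returns 1, or prints nothing and returns 0 when every file matches.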
Using rsync to verify the integrity of a duplicate
To guarantee that this test physically re-reads the files from the drive media, I suggest powering down both drives and restarting them before running it. This clears their internal volatile caches.
If you are not also restarting Linux, you should at least drop the kernel's caches (*) with:
sudo sh -c 'echo 3 > /proc/sys/vm/drop_caches'
Then to re-read both trees and compare their checksums:
rsync --dry-run --checksum --itemize-changes --archive SRC DEST
Modern rsync uses MD5 for this checksum, which is 128 bits. The likelihood of it failing to detect an error in an individual file is astronomically low (some discussion here), but not zero.
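The dry run above reports through its output, so a script has to interpret that output to get a pass or fail result. A minimal sketch follows: `rsync_verify` is a hypothetical helper, and it assumes that itemized lines beginning with `>f` or `<f` mark regular files that rsync would transfer, which is how the man page describes the --itemize-changes format. The trailing slashes it adds make rsync compare the contents of the two directories.

```sh
#!/bin/sh
# Hypothetical helper: succeed only if a checksum dry run finds no regular
# file that rsync would transfer from SRC to DEST.
rsync_verify() {
    # ${1%/}/ leaves exactly one trailing slash, so rsync compares the
    # contents of SRC with the contents of DEST.
    changes=$(rsync --dry-run --checksum --itemize-changes --archive \
        "${1%/}/" "${2%/}/") || return 2
    # Assumption: itemized lines starting with ">f" or "<f" are files that
    # would be transferred, i.e. missing from or different in DEST.
    if printf '%s\n' "$changes" | grep -q '^[<>]f'; then
        printf '%s\n' "$changes" | grep '^[<>]f'
        return 1
    fi
    return 0
}
```

Attribute-only differences, such as a directory timestamp, appear in the itemized output with a leading `.` and are deliberately ignored by this check.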
stackoverflow.com/questions/4493525/… – nobar, Apr 5 at 21:20
Good luck getting the trailing slashes right. – nobar, Apr 5 at 21:22
No news is good news. – nobar, Apr 5 at 21:24
Don't bother with --checksum until the test has passed without it. – nobar, Apr 6 at 1:11
Answer by Kyle Jones (the MD5/MD4 transfer-checksum explanation): answered Feb 5 '12 at 23:42, edited Dec 13 '16 at 2:03.
Answer by Felix (the timing and strace comparison against cp): answered Mar 3 '13 at 6:37, edited Mar 3 '13 at 7:45 by jasonwryan.
1
"The man page is unfortunately misleading about this. I also verified this with strace" Did you strace the remote, running rsync process or the local one? There are two... one runs on the destination, even when you use ssh.
– user129070
May 6 '13 at 19:20
8
There is no post-copy verification for any copies, local or remote. You run rsync -c again if you want to force it to check.
– psusi
May 6 '13 at 23:50
The verification is done on the incoming stream as it goes. It's not necessary to read it back from the disk if the filesystem has confirmed it's been written.
– OrangeDog
Jul 11 '18 at 15:51
rsync makes a checksum comparison before copying (in some cases), to avoid copying what's already there. The point of the checksum comparison is not to verify that the copy was successful. That's the job of the underlying infrastructure: the filesystem drivers, the disk drivers, the network drivers, etc. Individual applications such as rsync don't need to bother with this madness. All rsync needs to do (and does!) is to check the return values of system calls to make sure there was no error.
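The caller's side of that contract is to check rsync's own exit status. As a small sketch (describe_rsync_status is a hypothetical helper name; the numeric codes are the ones listed under EXIT VALUES in the rsync man page, and only a few of them are handled here):

```shell
# describe_rsync_status CODE: map an rsync exit status to a short label.
# Hypothetical helper; covers only a few of the documented exit values.
describe_rsync_status() {
    case $1 in
        0)  echo "success" ;;
        23) echo "partial transfer due to error" ;;
        24) echo "partial transfer: source files vanished" ;;
        *)  echo "failed with status $1" ;;
    esac
}

# Typical use (SRC and DEST are placeholders):
#   rsync -a SRC DEST
#   describe_rsync_status "$?"
describe_rsync_status 0
describe_rsync_status 23
```

A zero status means every system call rsync made reported success; it does not mean the data was read back and compared.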
answered Feb 5 '12 at 23:10 by Gilles
1
This seems to contradict the accepted answer...
– djule5
Jan 13 '16 at 6:45
2
@djule5 In what way? The accepted answer seems to mostly be about how rsync checks transferred files, but the question, and my answer, are about local copies.
– Gilles
Jan 13 '16 at 10:16
3
Ok, well in that context I agree it makes more sense. So "The point of the checksum comparison is not to verify that the copy was successful" is true only for local copies; and "checksums are always used on the data transferred between the sending and receiving rsync processes" is true only for transferred copies. I find the accepted answer misleading in regard to the question and believe your answer should be the accepted one (just my 2 cents).
– djule5
Jan 13 '16 at 18:23
I still feel this answer is slightly misleading. For example, it says that the network drivers in particular verify if the copy was successful - but if you were saying that checksum comparison does not verify if the copy was successful for local only, network drivers would not come into play.
– Ken
Aug 7 '17 at 19:56
1
@Ken I don't understand the point you're trying to make. I suspect you misread something. The network drivers come into play only if there's a network copy. Rsync itself does a checksum comparison before doing any copy, in order to decide whether to copy. Rsync doesn't do any checksum comparison after copying (because it would be pointless: it knows what it's just copied).
– Gilles
Aug 7 '17 at 20:04
Quick and dirty answers, directly to the questions.
Q: Will rsync make the comparison when copying the files between two local drives?
A: It will do a comparison to figure out what to copy.
Q: If it does do a verification - is it a safe bet? Or is it better to do a byte by byte comparison?
A: It is as safe as the mathematics behind the MD5 checksum of a file. You can run a simple experiment to learn and trust the tool.
Long answer: I guess you wanted rsync to do a file comparison (bit by bit or by checksum) after copying files. If you are one of the few who value data integrity, you might find the following useful:

    rsync -avh [source] [destination] && rsync -avhc [source] [destination]

The command above copies the files on the first run and, if that completes without error, immediately runs rsync again, this time comparing same-named files by a hash of the entire file and re-copying any that differ.
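One variation, sketched here under assumptions rather than taken from the answer: make the second pass a dry run, so that a checksum mismatch is reported as a failure instead of being silently re-copied. copy_then_verify is a hypothetical wrapper name; -n, -c and -i are rsync's short forms of --dry-run, --checksum and --itemize-changes.

```shell
# copy_then_verify SRC DEST: hypothetical wrapper around two rsync passes.
# Pass 1 copies. Pass 2 re-reads both sides and compares full-file checksums
# without changing anything, so a mismatch is reported rather than re-copied.
copy_then_verify() {
    rsync -a "$1" "$2" || return 1
    # -n: dry run, -c: compare by checksum, -i: itemize what would change
    mismatches=$(rsync -anci "$1" "$2") || return 1
    if [ -n "$mismatches" ]; then
        printf '%s\n' "$mismatches" >&2
        return 2
    fi
}

# Usage (placeholder paths):
#   copy_then_verify /path/to/source/ /path/to/destination/
```

An empty second pass returns status 0; any itemized line is printed to standard error and the function returns 2.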
edited Feb 11 at 20:42 by James K Polk
answered Nov 28 '18 at 5:29 by M.N.
Using rsync to verify the integrity of a duplicate
To guarantee that this test physically re-reads the files from the drive media, I suggest powering down both drives and restarting them before running this test. This will clear their internal volatile caches.
If you are not also restarting Linux, you should at least drop the kernel caches with:

    sudo sh -c 'echo 3 > /proc/sys/vm/drop_caches'

Then, to re-read both trees and compare their checksums:

    rsync --dry-run --checksum --itemize-changes --archive SRC DEST

Since version 3.0, rsync's --checksum has used MD5, which is 128 bits (releases from 3.2 on may negotiate another algorithm such as xxHash). The likelihood of this failing to detect an error in an individual file is astronomically low (some discussion here), but not impossible.
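To put a rough number on "astronomically low", here is a back-of-the-envelope estimate. It rests on an assumption that is not from the answer itself: for accidental (non-adversarial) corruption, the 128-bit digest of the damaged file is modelled as a uniformly random value.

```latex
% Assumption: the digest of an accidentally corrupted file is modelled as a
% uniformly random 128-bit value (this says nothing about deliberate collisions).
\[
  P(\text{one corrupted file goes undetected}) \approx 2^{-128} \approx 2.9 \times 10^{-39}
\]
% Union bound over N corrupted files:
\[
  P(\text{any of } N \text{ goes undetected}) \le N \cdot 2^{-128},
  \qquad N = 10^{9} \;\Rightarrow\; \le 2.9 \times 10^{-30}
\]
```

Under that model, even a billion corrupted files leave the chance of a single missed error far below anything the hardware itself can promise.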
edited Apr 5 at 20:37
answered Apr 5 at 20:19 by nobar
stackoverflow.com/questions/4493525/…
– nobar
Apr 5 at 21:20
Good luck getting the trailing slashes right.
– nobar
Apr 5 at 21:22
No news is good news.
– nobar
Apr 5 at 21:24
Don't bother with --checksum until the test has passed without it.
– nobar
Apr 6 at 1:11