[mdlug] rsync scenarios
Adam Tauno Williams
awilliam at opengroupware.us
Thu Mar 10 07:54:56 EST 2011
On Wed, 2011-03-09 at 18:10 -0500, Aaron Kulkis wrote:
> Michael ORourke wrote:
> > Lug Nuts,
> > I have a couple of scenarios that I was trying to figure out.
> > First scenario...
> > What happens if you execute multiple rsync processes against the same directory at about the same time? Will it cause problems?
Yes. In general, filesystems handle concurrent access poorly [unless the
*application* is smart enough to make provisions for it].
> > Suppose you have an automated process, that checks a directory on Srv-1/dir1,
> > then fires off a rsync process to Srv-2/dir1, what happens if the first rsync
> > doesn't finish before another rsync process is fired off?
You end up with two concurrent rsyncs running, which may not hurt
anything but falls into the "undefined behavior" category.
> > Ideally, I would like to have it wait, then re-execute or throw an
> > error (rsync in process or something). Of course I could write a
> > wrapper script that sets a lockfile when the rsync process starts,
> > and clears it when it finishes. So I could guarantee that only one
> > rsync process is running at a given time.
+1
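A wrapper along those lines could be sketched as below. This is a
minimal sketch, not a tested production script: it assumes a Unix-like
host (for fcntl.flock), and the lock path, the function name
run_exclusive, and the rsync arguments in the comment are all
hypothetical. It uses a kernel lock on the lock file rather than the
file's mere existence, so a crashed run does not leave a stale lock
behind.

```python
import fcntl
import subprocess
import sys

LOCK_PATH = "/tmp/rsync-dir1.lock"  # hypothetical lock file location


def run_exclusive(cmd, lock_path=LOCK_PATH):
    """Run cmd while holding an exclusive lock on lock_path.

    Returns the command's exit code, or None if another process
    already holds the lock (i.e. a previous run is still going).
    """
    with open(lock_path, "w") as lock_file:
        try:
            # Non-blocking: fail immediately instead of queueing up.
            fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            return None
        # The lock is released when lock_file is closed, including
        # when this process dies unexpectedly.
        return subprocess.call(cmd)


# Example use (hypothetical paths and host):
#   rc = run_exclusive(["rsync", "-a", "/dir1/", "Srv-2:/dir1/"])
#   if rc is None:
#       sys.exit("rsync already in progress")
#   sys.exit(rc)
```

Returning None instead of waiting gives the "throw an error" behavior;
dropping LOCK_NB would make the second invocation wait for the first.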
> > Another scenario... what if you have a process that is writing to a
> > file on Srv-1, but hasn't completed yet when the first rsync process
> > fires off? Does that file get skipped, or does it get partially
> > rsync'd?
Partial. Filesystem-level backups *always* require applications to be
quiescent.
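If you cannot quiesce the writer, one rough way to at least notice the
problem is to compare a file's size and mtime before and after copying
it. The sketch below is only a heuristic and is not something rsync is
known to do here; the function name copy_if_stable is hypothetical, and
a writer that modifies the file without changing its size or mtime
would slip through.

```python
import os
import shutil


def copy_if_stable(src, dst):
    """Copy src to dst and report whether src looked unchanged.

    Returns True only if src's size and mtime were identical before
    and after the copy; False means the copy may be partial.
    """
    before = os.stat(src)
    shutil.copy2(src, dst)
    after = os.stat(src)
    return (before.st_size, before.st_mtime_ns) == (
        after.st_size,
        after.st_mtime_ns,
    )
```

A False result only says "retry later"; it does nothing about two
correlated files being captured at different moments.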
> > Anyone know how rsync handles these 2 different scenarios?
Simply, it doesn't. The only safe scenario would be if rsync attempted
to acquire an exclusive lock on its current files of interest during the
process; I checked once upon a time, and rsync didn't seem to file-lock
at all.
Of course there is the related issue of two separate files whose
contents are correlated: A & B. rsync copies B, application writes to A
& B, rsync copies A. When you try to use those files from your backup
the app just pukes mysteriously.
rsync is an awesome tool, I'm not knocking it, I use it, but it is what
it is - a filesystem-level tool. Since the filesystem doesn't understand
the application's / site's data model... the resulting copy may be
inconsistent unless external provisions are made to prevent that
condition [because *you* know about the application's / site's
data model].
> But in general, the rule is most likely this:
> If two rsync processes are READING from the same location(s),
> but WRITING to different locations, then everything should
> be safe. In any other scenario (both writing to the same
> paths and/or one writing to the same place another is reading
> from), then you risk a race condition which is destructive.
I think the condition that results, more likely than a race, is just
scrambled data.