[mdlug] GNU sed weirdness

Jeff Hanson jhansonxi at gmail.com
Sat Jan 23 21:11:49 EST 2010


With GNU sed on Ubuntu 8.04 I wrote a simple script that takes a
Windows full-path file listing and turns it into a Windows 7-Zip
compression batch file.  It uses back references to get the parent
directory path and filename.  There are several files that have the
same name but different extensions, and I just want to put all the
ones that match by filename into the same 7z file.  I wanted to use
variables for the 7-Zip path, target directory, and compression
options.  Because of spaces in the directory and file names I need to
embed quotes.  To keep the sed command from getting really ugly I
split it into a single-quoted search pattern, a double-quoted
replacement for the variables (double-quoted because of the variable
expansions), and a single-quoted replacement for the back references.
Back reference \1 holds the directory path and \2 holds the filename
without extension.  What I thought was the correct form looks like
this (all on one line):

sed -n 's/\(.*\)\\\(.*\)\.idx/'"\"$compress_exe\" $compress_opts \"$target_dir\\"'\2'".$compress_ext"'\" \"\1\\\2.*\"/p' $1

Broken up:
sed -n 's/\(.*\)\\\(.*\)\.idx/'
"\"$compress_exe\" $compress_opts \"$target_dir\\"
'\2'
".$compress_ext"
'\" \"\1\\\2.*\"/p' $1
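One way to check what sed is actually handed, assuming a POSIX shell,
is to assemble just the pieces around the first \2 and print them
without running sed.  The variable name "fragment" is only for
illustration; the values are the ones from the basic input.

```shell
#!/bin/sh
# Values copied from the post; single quotes keep every backslash literal.
compress_ext='7z'
target_dir='D:\\archive'

# The pieces around the first \2, quoted exactly as in the sed command:
# double-quoted, then single-quoted, then double-quoted again.
fragment="\"$target_dir\\"'\2'".$compress_ext"

# printf '%s\n' prints the string without interpreting backslashes,
# so this is the text sed would see in its replacement.
printf '%s\n' "$fragment"
```

This prints "D:\\archive\\2.7z (with the leading quote).  The shell has
already reduced the double-quoted \\ to a single backslash, so the text
in front of the 2 is \\ and not \\\ as the layout of the command
suggests.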

Basic input is:
compress_exe='c:\\Program Files\\7-Zip\\7z.exe'
compress_ext='7z'
compress_opts='a -mx=7 -ms=on -wD:\\'
target_dir='D:\\archive'

filelist.txt ($1):
D:\data\20100123.idx (the directory contains several other files with
the same path/name but with different extensions)
...

Desired output is:
"c:\Program Files\7-Zip\7z.exe" a -mx=7 -ms=on -wD:\
"D:\archive\20100123.7z" "D:\data\20100123.*"

Not too complicated but it doesn't work.  I tried different quotes and
escape orders but always end up with:
"c:\Program Files\7-Zip\7z.exe" a -mx=7 -ms=on -wD:\ "D:\archive\2.7z"
"D:\data\20100123.*"
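The symptom can be isolated in a reduced test case.  This is a sketch
that assumes a POSIX shell; the placeholder X stands in for the
variables and the single input line stands in for filelist.txt.  It
compares two and four backslashes inside the double quotes.

```shell
#!/bin/sh
# One sample line instead of the real file listing.
line='D:\data\20100123.idx'

# "\\" inside double quotes reaches sed as ONE backslash, so the
# replacement sed sees is X\\2: escaped backslash, then a literal 2.
broken=$(printf '%s\n' "$line" |
  sed -n 's/\(.*\)\\\(.*\)\.idx/'"X\\"'\2/p')

# "\\\\" inside double quotes reaches sed as TWO backslashes, so the
# replacement sed sees is X\\\2: escaped backslash, then back reference 2.
fixed=$(printf '%s\n' "$line" |
  sed -n 's/\(.*\)\\\(.*\)\.idx/'"X\\\\"'\2/p')

printf '%s\n' "$broken" "$fixed"
```

The first result is X\2 and the second is X\20100123, which matches the
literal 2 in the output above.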

For some reason it seems that sed gets confused by the double-quoted
escaped backslash and the following single-quoted back reference \2.
Either it prints nothing or uses the back reference index literally.
The only way I could get it to work was to move the backslash into the
variable instead:

target_dir='D:\\archive\\'

sed -n 's/\(.*\)\\\(.*\)\.idx/'"\"$compress_exe\" $compress_opts \"$target_dir"'\2'".$compress_ext"'\" \"\1\\\2.*\"/p' $1

This worked.  Is there something fundamental I'm overlooking with the
earlier version or is it a bug?
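For comparison, here is a sketch of a variant that leaves target_dir
without the trailing backslash and keeps the separator inside the
single-quoted piece, where the shell does not touch it.  It assumes a
POSIX shell; the variable name "script" and the one-line sample input
are only for illustration.

```shell
#!/bin/sh
compress_exe='c:\\Program Files\\7-Zip\\7z.exe'
compress_ext='7z'
compress_opts='a -mx=7 -ms=on -wD:\\'
target_dir='D:\\archive'

# Build the sed expression piece by piece so it can be inspected.
script='s/\(.*\)\\\(.*\)\.idx/'
# Double-quoted piece: only the variables and the embedded quotes.
script=$script"\"$compress_exe\" $compress_opts \"$target_dir"
# Single-quoted pieces: \\ is an escaped backslash, \2 the back reference.
script=$script'\\\2'".$compress_ext"'" "\1\\\2.*"/p'

out=$(printf '%s\n' 'D:\data\20100123.idx' | sed -n "$script")
printf '%s\n' "$out"
```

Running printf '%s\n' "$script" before the sed call shows the exact
expression, which makes it easier to count the backslashes.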


