[mdlug] GNU sed weirdness
Raymond McLaughlin
driveray at ameritech.net
Sun Jan 24 00:48:02 EST 2010
Jeff:
While I haven't parsed exactly what you are trying to do, I do have one
comment.
The "sed s" construct can use characters other than the forward slash as
its field delimiter. Strictly speaking, sed treats the first character
after the 's' as its delimiter. In practice some characters work better
than others. If I'm parsing strings with lots of slashes in them, like
path strings generated in Windows, I use '%' as it rarely appears in
filenames. Thus:
$ sed 's%\.EXE%.exe%g' filelist.txt >>filelist.tmp
should change all instances of '.EXE' to '.exe' in an input file full of
Windows generated filenames.
Since your example contains no '%'s this should help.
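
As a rough illustration of the delimiter point, here is a small sketch
that turns the backslashes in a path into forward slashes, once with the
usual '/' delimiter and once with '%'. The sample line is borrowed from
the quoted message below; the variable names are just placeholders for
this example.

```shell
# Sample path line (borrowed from the quoted message below).
line='D:\data\20100123.idx'

# With '/' as the delimiter, the '/' in the replacement must be escaped.
with_slash=$(printf '%s\n' "$line" | sed 's/\\/\//g')

# With '%' as the delimiter, the '/' needs no escaping at all.
with_percent=$(printf '%s\n' "$line" | sed 's%\\%/%g')

printf '%s\n' "$with_slash" "$with_percent"
```

Both commands should produce the same line; the '%' form is simply
easier to read when the text itself is full of slashes.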
Raymond McLaughlin
Jeff Hanson wrote:
> With GNU sed on Ubuntu 8.04 I wrote a simple script that takes in a
> Windows full-path file listing and changes it to a Windows 7-zip
> compression batch file. It uses back references to get the parent
> directory path and filename. There are several files that have the
> same name but with different extensions and I just want to put all the
> ones that match by filename into the same 7z file. I wanted to use
> variables for the 7-zip path, target directory, and compression
> options. Because of spaces in the directory and file names I need to
> embed quotes. To keep the sed command from getting really ugly I
> split it up into a single-quoted search, double-quoted (because of
> variable expansions) replacement for the variables followed by
> single-quoted replacement for the back references. Back
> reference \1 holds the directory path and \2 holds the filename
> without extension. What I thought was the correct form looks like
> this:
>
> sed -n 's/\(.*\)\\\(.*\)\.idx/'"\"$compress_exe\" $compress_opts
> \"$target_dir\\"'\2'".$compress_ext"'\" \"\1\\2.*\"/p' $1
>
> Broken up:
> sed -n 's/\(.*\)\\\(.*\)\.idx/'
> "\"$compress_exe\" $compress_opts \"$target_dir\\"
> '\2'
> ".$compress_ext"
> '\" \"\1\\2.*\"/p' $1
>
> Basic input is:
> compress_exe='c:\\Program Files\\7-Zip\\7z.exe'
> compress_ext='7z'
> compress_opts='a -mx=7 -ms=on -wD:\\'
> target_dir='D:\\archive'
>
> filelist.txt ($1):
> D:\data\20100123.idx (the directory contains several other files with
> the same path/name but with different extensions)
> ...
>
> Desired output is:
> "c:\Program Files\7-Zip\7z.exe" a -mx=7 -ms=on -wD:\
> "D:\archive\20100123.7z" "D:\data\20100123.*"
>
> Not too complicated but it doesn't work. I tried different quotes and
> escape orders but always end up with:
> "c:\Program Files\7-Zip\7z.exe" a -mx=7 -ms=on -wD:\ "D:\archive\2.7z"
> "D:\data\20100123.*"
>
> For some reason it seems that sed gets confused by the double-quoted
> escaped backslash and the following single-quoted back reference \2.
> Either it prints nothing or uses the back reference index literally.
> The only way I could get it to work was to move the backslash into the
> variable instead:
>
> target_dir='D:\\archive\\'
>
> sed -n 's/\(.*\)\\\(.*\)\.idx/'"\"$compress_exe\" $compress_opts
> \"$target_dir"'\2'".$compress_ext"'\" \"\1\2.*\"/p' $1
>
> This worked. Is there something fundamental I'm overlooking with the
> earlier version or is it a bug?
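
One possible explanation for the question above, offered as a sketch
rather than a confirmed diagnosis: inside double quotes the shell turns
\\ into a single backslash, so that lone backslash and the one from the
following '\2' reach sed as \\2, which sed reads as an escaped backslash
followed by a literal 2. The example below is a cut-down version of the
command (only the target path is rebuilt, and the .7z extension is
hard-coded); it assumes a POSIX shell and uses target_dir and the sample
line from the message above.

```shell
target_dir='D:\\archive'       # value from the message above
line='D:\data\20100123.idx'    # sample input line from the message above

# "\\" in double quotes becomes ONE backslash, so sed sees ...\\2:
# an escaped backslash followed by a literal 2.
broken=$(printf '%s\n' "$line" |
  sed -n 's/\(.*\)\\\(.*\)\.idx/'"$target_dir\\"'\2.7z/p')

# "\\\\" in double quotes becomes TWO backslashes, so sed sees ...\\\2:
# an escaped backslash followed by back reference \2.
fixed=$(printf '%s\n' "$line" |
  sed -n 's/\(.*\)\\\(.*\)\.idx/'"$target_dir\\\\"'\2.7z/p')

printf '%s\n' "$broken" "$fixed"
```

If this reading is right, it is shell quoting rather than a sed bug, and
it would also explain why moving the trailing backslashes into the
single-quoted variable works.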