[mdlug] GNU sed weirdness

Raymond McLaughlin driveray at ameritech.net
Sun Jan 24 01:08:08 EST 2010


Looking again, it might not be much help. Sorry.

Raymond McLaughlin wrote:
> Jeff:
> While I haven't parsed exactly what you are trying to do, I do have one
> comment.
> 
> The "sed s" construct can use characters other than the forward slash as
> its field delimiter. Strictly speaking, sed treats the first character
> after the 's' as its delimiter. In practice some characters work better
> than others. If I'm parsing strings with lots of slashes in them, like
> path strings generated in Windows, I use '%' as it rarely appears in
> filenames. Thus:
> 
> $ sed 's%\.EXE%.exe%g' filelist.txt >>filelist.tmp
> 
> should change all instances of '.EXE' to '.exe' in an input file full of
> Windows-generated filenames.
> 
> Since your example contains no '%'s this should help.
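
To make the delimiter point concrete, here is a minimal sketch. The sample
path is made up for illustration and is not from Jeff's file list; it uses
forward slashes because that is the case where a different delimiter saves
escaping.

```shell
# Hypothetical sample line, not taken from filelist.txt.
line='/usr/local/bin/7z'

# '%' is the delimiter, so the slashes in the pattern and the
# replacement can be written as-is.
result=$(printf '%s\n' "$line" | sed 's%/usr/local%/opt%')

printf '%s\n' "$result"    # prints /opt/bin/7z
```

With '/' as the delimiter, every slash in the pattern and replacement would
have to be written as \/. A different delimiter does nothing for backslashes,
though; those still need escaping.
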
> 
> Raymond McLaughlin
> 
> Jeff Hanson wrote:
>> With GNU sed on Ubuntu 8.04 I wrote a simple script that takes in a
>> Windows full-path file listing and changes it to a Windows 7-zip
>> compression batch file.  It uses back references to get the parent
>> directory path and filename.  There are several files that have the
>> same name but with different extensions and I just want to put all the
>> ones that match by filename into the same 7z file.  I wanted to use
>> variables for the 7-zip path, target directory, and compression
>> options.  Because of spaces in the directory and file names I need to
>> embed quotes.  To keep the sed command from getting really ugly I
>> split it up into a single-quoted search, double-quoted (because of
>> variable expansions) replacement for the variables followed by
>> single-quoted replacement for the back references.  Back reference
>> \1 holds the directory path and \2 holds the filename
>> without extension.  What I thought was the correct form looks like
>> this:
>>
>> sed -n 's/\(.*\)\\\(.*\)\.idx/'"\"$compress_exe\" $compress_opts
>> \"$target_dir\\"'\2'".$compress_ext"'\" \"\1\\2.*\"/p' $1
>>
>> Broken up:
>> sed -n 's/\(.*\)\\\(.*\)\.idx/'
>> "\"$compress_exe\" $compress_opts \"$target_dir\\"
>> '\2'
>> ".$compress_ext"
>> '\" \"\1\\2.*\"/p' $1
>>
>> Basic input is:
>> compress_exe='c:\\Program Files\\7-Zip\\7z.exe'
>> compress_ext='7z'
>> compress_opts='a -mx=7 -ms=on -wD:\\'
>> target_dir='D:\\archive'
>>
>> filelist.txt ($1):
>> D:\data\20100123.idx (the directory contains several other files with
>> the same path/name but with different extensions)
>> ...
>>
>> Desired output is:
>> "c:\Program Files\7-Zip\7z.exe" a -mx=7 -ms=on -wD:\
>> "D:\archive\20100123.7z" "D:\data\20100123.*"
>>
>> Not too complicated but it doesn't work.  I tried different quotes and
>> escape orders but always end up with:
>> "c:\Program Files\7-Zip\7z.exe" a -mx=7 -ms=on -wD:\ "D:\archive\2.7z"
>> "D:\data\20100123.*"
>>
>> For some reason it seems that sed gets confused by the double-quoted
>> escaped backslash and the following single-quoted back reference \2.
>> Either it prints nothing or uses the back reference index literally.
>> The only way I could get it to work was to move the backslash into the
>> variable instead:
>>
>> target_dir='D:\\archive\\'
>>
>> sed -n 's/\(.*\)\\\(.*\)\.idx/'"\"$compress_exe\" $compress_opts
>> \"$target_dir"'\2'".$compress_ext"'\" \"\1\2.*\"/p' $1
>>
>> This worked.  Is there something fundamental I'm overlooking with the
>> earlier version or is it a bug?
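
On the question just above, a minimal sketch of what seems to be going on,
assuming a POSIX shell. The sample line and the trimmed-down replacement are
mine, for illustration; the search pattern and target_dir are the ones from
the post. Inside double quotes the shell reduces \\ to a single backslash
before sed ever runs, so sed receives that backslash immediately followed by
\2, that is \\2, which it reads as an escaped backslash and a literal 2. That
would make this a shell quoting effect rather than a sed bug.

```shell
line='D:\data\20100123.idx'
target_dir='D:\\archive'    # single quotes: sed sees both backslashes

# "...\\" in double quotes reaches sed as one backslash; glued to '\2'
# it forms \\2, which sed treats as a literal backslash plus "2".
broken=$(printf '%s\n' "$line" |
    sed -n 's/\(.*\)\\\(.*\)\.idx/'"$target_dir\\"'\2.7z/p')

# "...\\\\" reaches sed as \\ (a literal backslash), so the \2 that
# follows is still read as a back reference.
fixed=$(printf '%s\n' "$line" |
    sed -n 's/\(.*\)\\\(.*\)\.idx/'"$target_dir\\\\"'\2.7z/p')

printf '%s\n' "$broken"    # prints D:\archive\2.7z
printf '%s\n' "$fixed"     # prints D:\archive\20100123.7z
```

Moving the trailing backslashes into the single-quoted variable works for the
same reason: single quotes pass both backslashes through to sed untouched. As
posted, the single-quoted tail \1\\2 has the same shape, and \1\\\2 would be
the form that yields a backslash followed by group 2.
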
>> _______________________________________________
>> mdlug mailing list
>> mdlug at mdlug.org
>> http://mdlug.org/mailman/listinfo/mdlug
>>
> 
> 



