[mdlug] Firefox bookmarks problem and solution

Raymond McLaughlin driveray at ameritech.net
Sat Aug 18 07:57:36 EDT 2007


MDLUGers,
The list has been kind of slow, so I hope you don't mind me sharing a
useful little hack I worked up the other day. It solves a problem that
has irked me since the early mozilla days.

*The Problem*
This involves the manageability of the browser's bookmark files across
multiple systems. This file, typically
~/.mozilla/firefox/{cryptic}.default/bookmarks.html, has not been simple
html since the late netscape or very early mozilla days. Back then it
was fairly simple to read and edit this file in any text editor. But for
quite a while now this "html" file has actually been xml. Some people
tell me they have no problem editing xml, but I use firefox's
bookmarking capabilities rather extensively, so my bookmark files are
kind of large, about 256k. OK that's not really very large, but there
are long blocks of mime encoded binary, usually the 'favicons' images of
web sites that make it very hard to read. I would just as soon, for the
most part, let firefox manage the coding of these files, and do my
management via the "Organize Bookmarks..." interface in the Bookmarks
menu in the browser.

The problem that arises is that I use several different computers for my
browsing, and might come across a site I think worth bookmarking while
using any one of them. This invariably leads to forking among several
copies of a given set of bookmarks. Several aspects of how firefox
handles its bookmarks files conspire to make this forking inevitable.
The first problematic factor is that mozilla (generic here for mozilla
derived browsers) reads the bookmarks.html into memory, from disk, when
you start the browser. From then on it works only with the RAM copy until
it shuts down, at which point it writes the version from RAM to
disk. Thus simply copying an updated bookmarks file, say from machine A
to machine B, will have no effect if done while the browser is running
on machine B. The browser on machine B will overwrite the imported file
with whatever it has in RAM when it exits.
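
One way to guard against this is to refuse to replace the file while a
browser is still running. The sketch below is only an illustration and
is not part of bookdiff: the function names browser_running and
install_bookmarks are made up here, and it assumes that pgrep is
installed and that the browser's process is named firefox or
firefox-bin.

```shell
# browser_running: succeed if the current user has a firefox process.
# Assumption: the process is named "firefox" or "firefox-bin".
browser_running() {
  pgrep -u "$(id -u)" -x firefox >/dev/null 2>&1 ||
    pgrep -u "$(id -u)" -x firefox-bin >/dev/null 2>&1
}

# install_bookmarks SRC DEST: copy SRC over DEST only when no browser is
# running, since a running browser would overwrite DEST when it exits.
install_bookmarks() {
  if browser_running; then
    echo "firefox is running; close it before replacing $2" >&2
    return 1
  fi
  cp -- "$1" "$2"
}
```

With these defined, something like
"install_bookmarks bookmarks.machineb.html bookmarks.html" would stop
with a message instead of copying a file that is about to be clobbered.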

I often leave firefox running on machine A, downstairs, when I go
upstairs and start browsing on machine B. If I bookmark a new
link on machine B, and want to also have it in my bookmarks on machine
A, I can't just scp my bookmarks file to the downstairs computer. No, I
first have to close the current browser, go downstairs and close the
browser there (or ssh in and kill all mozilla processes) and then copy
the file on machine B to overwrite the file on machine A. And any fresh
bookmarks that may have been saved on machine A would be lost with
the overwrite.

An alternative is, after closing mozilla on machine B, to scp its
bookmark file to the firefox directory on machine A, but using a
different file name, such as bookmarks.machineb.html. But what then? The
Bookmarks Manager's built-in import tool would just suck *all* the
bookmarks from the machineb file leaving me with a huge number of
duplicates to deal with, just to get the one new one, if I can remember
which one it is.

*The Solution (such as it is)*
This is where my solution comes in. Once I seriously looked at solving
this on my own, it turned out to be borderline trivial. If this
duplicates a solution published elsewhere, I apologize. The last time I
googled this problem all I came up with was several sets of instructions
on how to locate and overwrite one file with another, but no solution
to the forking problem.

In the hopes that it might be useful to some of you, here is my script,
bookdiff.

#####################################################################
#!/bin/bash
# bookdiff -- a script for reconciling forked mozilla/firefox bookmark
# files. The URLs in the two bookmark files are compared, and the URLs
# found in only one of them are sorted and written to bookdiff.html

if [ $# -ne 2 ]
  then echo
     echo ' bookdiff- a script for reconciling forked mozilla/firefox' \
          'bookmark files'
     echo ' Usage:  bookdiff [path/]bookmarkfile1 [path/]bookmarkfile2'
     echo ' BLANK SPACES IN FILE NAMES ARE *EVIL* AND NOT ALLOWED HERE!'
     echo
     exit 1
fi


#Keep the file names in variables that the parser function below can
#read. (A function runs in the current shell, so export is optional.)
export file1="$1"
export file2="$2"

#Declare a function to parse a list of unique URLs into a list of proper
# html anchor tags ending with line break tags.
function parser()
{
  while read -r urlmark
  do
     if echo "$urlmark" | grep 'file:///' >/dev/null
        then hostmark=$urlmark
        else hostmark=$(echo "$urlmark" | cut -d/ -f 3)
     fi
     #-F matches the URL literally, and the surrounding quotes keep it
     # from matching inside a longer URL.
     if grep -F -- "\"$urlmark\"" "$file1" >/dev/null 2>&1
       then filename=$file1
       else filename=$file2
     fi
     echo "$filename:<a href=\"$urlmark\">$hostmark </a> <br>" \
>>bookmarks.sort3

  done
}

#Cut out all of the URLs from each file, and save a single instance
# of each one, from each file.
grep "HREF=" "$file1" | cut -d \" -f 2 | sort | uniq  >bookmarks.sort1
grep "HREF=" "$file2" | cut -d \" -f 2 | sort | uniq >>bookmarks.sort1

#This step separates out the URLs found in one, but not both files.
sort bookmarks.sort1 | uniq -u >bookmarks.sort2

#Start from an empty output list, in case an earlier run was interrupted.
: >bookmarks.sort3
parser <bookmarks.sort2

#Combine the results into a simple, proper html file.
touch bookmarks.sort3
echo '<html>'                          >bookdiff.html
grep -F -- "$file1:" bookmarks.sort3  >>bookdiff.html
echo '<br>'                           >>bookdiff.html
echo '<br>'                           >>bookdiff.html
grep -F -- "$file2:" bookmarks.sort3  >>bookdiff.html
echo '</html>'                        >>bookdiff.html

#Clean up.
rm bookmarks.sort*

#####################################################################
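
For comparison, here is a rough sketch of another way to get the same
two lists. It is not what bookdiff does: it runs comm on the sorted URL
lists, so each direction can be asked for separately without grepping
for every URL. The function names list_urls and only_in are made up for
this example, mktemp is assumed to be installed, and, like bookdiff, it
assumes the URL is the first quoted string on each HREF= line.

```shell
# list_urls FILE: print each bookmarked URL in FILE once, sorted.
list_urls() {
  grep 'HREF=' "$1" | cut -d '"' -f 2 | LC_ALL=C sort -u
}

# only_in FILE_A FILE_B: print the URLs found in FILE_A but not in FILE_B.
only_in() {
  tmp_a=$(mktemp) || return 1
  tmp_b=$(mktemp) || { rm -f "$tmp_a"; return 1; }
  list_urls "$1" >"$tmp_a"
  list_urls "$2" >"$tmp_b"
  # comm -23 keeps only the lines unique to the first (sorted) input.
  LC_ALL=C comm -23 "$tmp_a" "$tmp_b"
  rm -f "$tmp_a" "$tmp_b"
}
```

With these, "only_in bookmarks.machineb.html bookmarks.html" would list
the links that exist only in the machineb copy, one per line.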

Usage notes:
Once bookdiff has been copied to a directory on your $PATH and made
executable, the most common usage would be to compare two bookmark files
in your firefox directory. For instance:

    $ cd .mozilla/firefox/{cryptic}.default/
    $ bookdiff ./bookmarks.machineb.html  ./bookmarks.html

Another usage might be:
    $ cd .mozilla/firefox/{cryptic}.default/
    $ bookdiff /mnt/usb/bookmarks.html  ./bookmarks.html

Once either of these comparisons has been run, a file 'bookdiff.html'
will appear in the current directory,
~/.mozilla/firefox/{cryptic}.default/ in this case. Browse this file in
firefox and you will see a list of all the links that are in the
"foreign" bookmark file, but not the current one, clearly labeled with
the "foreign" file's name. Below that will be a list of all the links
that are in your current bookmarks file, but not the foreign one. To
reconcile the two files you only need to be concerned with the top list.
With the browser open, import the links in the top list into your running
bookmarks file. This might be a good time to make sure that the links
are still good, and that you still want them. Once you have imported the
links you want to keep, close your browser and copy the resulting
bookmarks file to overwrite wherever the "foreign" file lives (as long
as it's not on a machine with the browser running).

Other considerations:
Mozilla bookmarks files contain "LAST_VISIT=" data in each link, so the
standard "diff" file utility is useless even when comparing bookmark
files containing no new links.
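
If a line-by-line diff is still wanted, one possible workaround is to
strip the LAST_VISIT attributes before comparing. This is only a sketch
under assumptions: the function names are made up here, mktemp is
assumed to be installed, and the attribute is assumed to be written as
LAST_VISIT="..." with a single space in front of it. Any other
attribute that changes between machines, and any difference in folder
order, would still show up in the diff.

```shell
# strip_last_visit FILE: print FILE with every LAST_VISIT="..." removed.
strip_last_visit() {
  sed 's/ LAST_VISIT="[^"]*"//g' "$1"
}

# diff_ignoring_visits FILE_A FILE_B: diff the two files after stripping
# LAST_VISIT, returning diff's own exit status.
diff_ignoring_visits() {
  tmp_a=$(mktemp) || return 1
  tmp_b=$(mktemp) || { rm -f "$tmp_a"; return 1; }
  strip_last_visit "$1" >"$tmp_a"
  strip_last_visit "$2" >"$tmp_b"
  diff "$tmp_a" "$tmp_b"
  status=$?
  rm -f "$tmp_a" "$tmp_b"
  return "$status"
}
```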

This bookdiff only compares sorted lists of the URLs contained in each
of the bookmark files, not the overall organization of the file. If you
compare two files with all the same URLs then the resulting
bookdiff.html file will appear blank in the browser, and View-->Page
Source will show:

     <html>
     <br>
     <br>
     </html>

The virtual folder structure of each file could still differ.

The actual string {cryptic} in the filename .../{cryptic}.default/ is
supposed to be a security secret unique to each user/installation. You
will have to deal with it in order to use bookdiff. Your bookdiff.html
file may reside in that directory, may contain the string, or both, for
instance if you bookmark the bookdiff.html file itself (generally a
convenience if you use it). This should be kept in mind when handling
copies of your bookmarks file.
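
If you only have one profile, the shell can find that directory for you
so the string never has to be typed. A minimal sketch, assuming the
directory name ends in .default and lives under ~/.mozilla/firefox as
in the path above; the function name find_profile_dir is made up here,
and with several matching directories it simply prints the first one.

```shell
# find_profile_dir [BASE]: print the first directory matching
# BASE/*.default (BASE defaults to ~/.mozilla/firefox); fail if none.
find_profile_dir() {
  base=${1:-"$HOME/.mozilla/firefox"}
  for dir in "$base"/*.default; do
    if [ -d "$dir" ]; then
      printf '%s\n' "$dir"
      return 0
    fi
  done
  return 1
}
```

A usage along the lines of the examples above might then be:

    $ cd "$(find_profile_dir)"
    $ bookdiff ./bookmarks.machineb.html  ./bookmarks.html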





