[mdlug] MySQL Upgrade - latin1 -> utf8??? SOLVED!

Wojtak, Greg GregWojtak at quickenloans.com
Wed Apr 30 13:53:58 EDT 2008


-----Original Message-----
From: mdlug-bounces at mdlug.org [mailto:mdlug-bounces at mdlug.org] On Behalf
Of Aaron Kulkis
Sent: Wednesday, April 30, 2008 1:50 PM
To: MDLUG's Main discussion list
Subject: Re: [mdlug] MySQL Upgrade - latin1 -> utf8??? SOLVED!

Wojtak, Greg wrote:
> -----Original Message-----
> From: mdlug-bounces at mdlug.org [mailto:mdlug-bounces at mdlug.org] On Behalf
> Of Aaron Kulkis
> Sent: Tuesday, April 29, 2008 8:22 AM
> To: MDLUG's Main discussion list
> Subject: Re: [mdlug] MySQL Upgrade - latin1 -> utf8???
> 
>> Wojtak, Greg wrote:
>>> I am trying to migrate a bunch of MySQL databases from an old, old
>>> Gentoo server running MySQL 4.0.22 to a RHEL server running MySQL
>>> 5.0.22.  In the process, we are also migrating from latin1 character
>>> set to utf8.  The database I am working with now is a mediawiki db,
>>> and when I do the standard mysqldump, run iconv on the resulting
>>> file, and then import the database, some of the pages in the wiki
>>> won't load until I do an edit and save.  Even when I do that, single
>>> quotes and double dashes (I presume they are "smart" quotes and one
>>> of the longer dashes like what gets generated by Word or OO Writer
>>> when you type --) show up as "funky characters."  The database
>>> cannot be dumped as utf8 as 4.0 does not support it.
>>>
>>> At this point, I don't know if it is a problem with the database
>>> import process or something with mediawiki.  Does anyone have any
>>> ideas on anything else I can try?
>> Work on just dumping and importing the data first.
>>
>> Once you have done that successfully, THEN work on doing the
>> latin-1 => UTF-8 conversion.
> 
> 
> It turns out the "special" characters that were not showing up were
> Microsoft Word's "smart quotes" and the elongated hyphen that Word
> turns a double dash (--) into.  The solution I found worked was to do
> the dump, open it up in EditPlus32 (a Windows based text editor) in
> UTF8 mode, then save it out in ANSI format and send it back to the
> server for re-importation (it is too a word, I just invented it).
> iconv was too aware of those special characters to fix them, at least
> the way I thought it would, but it turns out EditPlus and Windows are
> just dumb enough to get me what I wanted.  :)

man sed

Tried that, of course.  Couldn't figure out the appropriate escape
sequences for the characters.  By the time I did, I could have had the
database converted and loaded up anyway.
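
For anyone who hits the same thing, here is a rough sketch of one possible
explanation, not a confirmed diagnosis of this particular dump.  Word's
smart quotes and long dashes live at bytes 0x91-0x97 in Windows-1252
(what Windows calls "ANSI"), while ISO-8859-1, which is what iconv means
by "latin1", maps those same bytes to invisible control characters.
MySQL's own latin1 is really cp1252, so the two tools can disagree about
those bytes.  The sample string and the function names below are made up
for illustration, and the sketch assumes the dump really is single-byte
cp1252 data.

```python
# Bytes that Windows-1252 uses for Word's "smart" punctuation, mapped to
# plain ASCII.  These are the escape sequences a sed-style substitution
# would have to match.
SMART_BYTES = {
    b"\x91": b"'",   # left single quote
    b"\x92": b"'",   # right single quote
    b"\x93": b'"',   # left double quote
    b"\x94": b'"',   # right double quote
    b"\x96": b"-",   # en dash
    b"\x97": b"--",  # em dash
}


def cp1252_to_utf8(raw: bytes) -> bytes:
    """Re-encode bytes as UTF-8, reading them as Windows-1252.

    Raises UnicodeDecodeError on the few bytes cp1252 leaves undefined.
    """
    return raw.decode("cp1252").encode("utf-8")


def asciify(raw: bytes) -> bytes:
    """Replace smart punctuation bytes with plain ASCII equivalents.

    Only safe on single-byte cp1252 data; in a UTF-8 file the same byte
    values can occur inside multi-byte sequences.
    """
    for smart, plain in SMART_BYTES.items():
        raw = raw.replace(smart, plain)
    return raw


# Hypothetical fragment of a dump containing Word punctuation.
sample = b"\x93quoted\x94 text \x97 it\x92s here"

# Read as ISO-8859-1, byte 0x93 becomes the control character U+0093.
as_latin1 = sample.decode("latin-1")

# Read as Windows-1252, the same byte becomes a real left double quote.
as_utf8 = cp1252_to_utf8(sample)

print(repr(as_latin1[0]))
print(as_utf8.decode("utf-8"))
print(asciify(sample).decode("ascii"))
```

If that is what was going on, telling iconv the source is CP1252 rather
than latin1 might have done the same job as the EditPlus round trip, but
I have not tested that against a 4.0 dump.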