Results 1 to 1 of 1

Thread: Converting vBulletin to UTF-8

  1. #1
    AdministratorBoulder UserEvent ContributorHackerJournalist
    Kyohack's Avatar
    Join Date
    Mar 2011
    Hammond, LA
    4 Post(s)

    Converting vBulletin to UTF-8

    As some of you may know, our main site is integrated with vBulletin. It came to my attention that vBulletin's ISO-8859-1 encoding clashed with our site's UTF-8 encoding. vBulletin actually overrides our site's encoding, which causes accented é's and Japanese text to display incorrectly. So to fix this problem, I needed to convert our forum to use UTF-8 encoding to match our site. Using the MegaZ's guide as a reference, there were some slight differences in the process that I used to convert vBulletin to UTF-8. Here's what I did to fix the problem on our 4.2 vBulletin forum running on a Windows server with Apache, PHP, and MySQL:

    • Make a raw backup of the forum database by copying the contents of C:\ProgramData\MySQL\MySQL Server 5.5\data\forum to the desktop.
    • Make a second backup of the forum database by dumping it from commandline: "C:\Program Files\MySQL\MySQL Server 5.5\bin\mysqldump.exe" -u root -p"password" --max_allowed_packet=500M forum >C:\Users\Administrator\Desktop\encoding\forum_ful l.sql
    • Use the --no_data parameter to dump the database structures only, to determine which tables use BLOB to store binary data: "C:\Program Files\MySQL\MySQL Server 5.5\bin\mysqldump.exe" -u root -p"password" --max_allowed_packet=500M --no_data forum >C:\Users\Administrator\Desktop\encoding\forum_str ucture.sql
    • Dump the database tables that contain text and exclude tables that have binary data in them: "C:\Program Files\MySQL\MySQL Server 5.5\bin\mysqldump.exe" -u root -p"password" --max_allowed_packet=500M --opt --default-character-set=latin1 --ignore-table=forum.vb4_block --ignore-table=forum.vb4_customavatar --ignore-table=forum.vb4_customprofilepic --ignore-table=forum.vb4_filedata --ignore-table=forum.vb4_searchlog --ignore-table=forum.vb4_sigpic --ignore-table=forum.vb4_socialgroupicon --ignore-table=forum.vb4_stylevar --ignore-table=forum.vb4_stylevardfn forum >C:\Users\Administrator\Desktop\encoding\forum.s ql
    • Make a separate dump of the tables that have binary data: "C:\Program Files\MySQL\MySQL Server 5.5\bin\mysqldump.exe" -u root -p"password" --max_allowed_packet=500M --opt --default-character-set=latin1 forum vb4_block vb4_customavatar vb4_customprofilepic vb4_filedata vb4_searchlog vb4_sigpic vb4_socialgroupicon vb4_stylevar vb4_stylevardfn >C:\Users\Administrator\Desktop\encoding\forum_ima ges.sql
    • Convert the dump that has only text, from ISO-8859-1 encoding into UTF-8 encoding using the iconv program: "C:\Program Files (x86)\GnuWin32\bin\iconv.exe" -c -f ISO-8859-1 -t utf-8 C:\Users\Administrator\Desktop\encoding\forum.sql > C:\Users\Administrator\Desktop\encoding\forum_utf8 .sql
    • Open forum_utf8.sql in a text editor and replace "CHARSET=latin1" with "CHARSET=utf8" and then replace "CHARACTER SET latin1 COLLATE latin1_bin" with "CHARACTER SET utf8 COLLATE utf8_bin".
    • Save the forum_utf8.sql file that was edited.
    • Login to SQL, and run "DROP database forum; CREATE DATABASE forum;"
    • Now put the new UTF-8 encoded database back into SQL: "C:\Program Files\MySQL\MySQL Server 5.5\bin\mysql.exe" -u root -p"password" --max_allowed_packet=500M forum < C:\Users\Administrator\Desktop\encoding\forum_utf8 .sql
    • Add your original binary data that you dumped: "C:\Program Files\MySQL\MySQL Server 5.5\bin\mysql.exe" -u root -p"password" --max_allowed_packet=500M forum < C:\Users\Administrator\Desktop\encoding\forum_imag es.sql
    • Login to vBulletin's admincp and change the forum's encoding to utf-8 in Languages & Phrases -> Language Manager.
    • Run the vBulletin upgrade script:

    • Due to the large size of our database, I needed to specify the --max_allowed_packet=500M parameter in order to successfully perform a dump. You may need to raise or lower this limit to correspond with your database's size. For example, if your database is 573 megabytes in size, then you will want to use --max_allowed_packet=600M
    • Depending on the source encoding of your database, you will need to adjust the encoding conversion statement. You can find the encoding of your database by examining the encoding mentioned within the database structure dump, or by making note of the existing encoding in vBulletin's admincp -> Languages & Phrases -> Language Manager.
    • The tables that you will need to exclude from the database text dump, and include in the binary dump will vary across forums. You will need to examine the database structure dump in order to determine which database tables make use of BLOBs. If a table has a BLOB structured column, then it has binary data in it.
    • If you're like me, and you thought that you could solve your forum's encoding problems by simply changing vBulletin's encoding to utf-8 in Languages & Phrases -> Language Manager, and then manually fixing any broken accented é's, then you're wrong. When you follow the steps above, anything that you manually fixed will be broken, so now you'll need to manually fix them again. If you happen to know your way around a raw database dump and know how to use a hex editor, then I would strongly recommend searching a raw dump of the database text for the hex signature of your broken characters. This will help you determine how many you still need to fix manually, and it will tell you their general location. Example: é (C3 83 C2 A9) should be changed to é (C3 A9). If a result is found in a user's post data, then scroll up to find the post ID. If you find it in a PM, scroll up to find the username and the PM ID. You might not care to change all of your user's posts, but text such as corrupt usernames are very important and must be fixed in order for those users to login.
    Last edited by Kyohack; 02-14-2013 at 11:31 PM.
    If you would like to support the work I've done on this site, then please consider donating:

    Check out my Online Pokesav.
    Devil's Survivor Savegame Editor

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts