Solved the Telugu character display problem

Browsed quite a bit on UTF-8 and PHP; there is a lot of material on it. Apparently PHP's UTF-8 support is not very good, i.e. it looks like you need to do some additional programming for it. It was quite a pain, as a lot of the suggestions did not work.

Finally this link gave me the solution for a PHP only page using mysqli. All I needed was a:
mysqli_set_charset($con, "utf8");
where $con is the link identifier returned by mysqli_connect().
Now the Telugu characters show up properly in the PHP page.

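PHP page source code:

<?php
// Send the charset in the HTTP header; this must run before any output.
header('Content-type: text/html; charset=UTF-8');
?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "">

<html xmlns="">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<title>Untitled Document</title>
</head>
<body>
<?php
// Connection details are placeholders; the original connection code is not shown in this post.
$con = mysqli_connect("localhost", "user", "password", "database");
if (mysqli_connect_errno()) {
    // Assumed to be the connection-failure message.
    echo "Errorinall<br>";
    exit;
}
// Make the client connection use UTF-8 so the Telugu text is not mangled.
mysqli_set_charset($con, "utf8");
$result = mysqli_query($con, "SELECT * FROM telugu2english");

echo "<br><table border='1'>
<tr>
<th>Telugu sentence in Telugu script</th>
<th>English sentence in English script</th>
<th> English sentence audio </th>
</tr>";

while ($row = mysqli_fetch_array($result)) {
    echo "<tr>";
    $a = $row['telugu'];
    //   $a = "అ";
    // echo "<td>" . $row['id'] . "</td>";
    //  echo "<td>" . $row['telugu'] . "</td>";
    echo "<td>" . $a . "</td>";
    echo "<td>" . $row['english'] . "</td>";
    echo "<td>" . $row['english_audio_path'] . "</td>";
    echo "</tr>";
}
echo "</table>";
mysqli_close($con);
?>
</body>
</html>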

Now I need to look at the CodeIgniter framework-based code and see how to fix that; a page I found has some useful info.

Update on 21st Sept. 2011:

Had a weird experience! Initially the CodeIgniter (CI) code was displaying junk characters. I played around with it by changing it to what was being used in the above-mentioned PHP-only page (or the equivalent in the CI framework). If I recall correctly, I did the following:

  • In t2e_view.php added HTML tags including the meta tag where charset got specified as UTF-8
  • Included this php code in t2e_view.php: header('Content-Type: text/html; charset=utf-8');
  • Included this ci code in t2e_model.php: $db['default']['char_set'] = "utf8";

Then the CI program showed Telugu characters properly! Problem solved!
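
For reference, a minimal sketch of where the charset setting from the last bullet usually lives. As far as I know, CodeIgniter reads the connection charset from its database config file rather than from a model, so this assumes the stock CI 2.x layout (application/config/database.php). The dbcollat line is an addition that is not in the changes listed above.

```php
<?php
// application/config/database.php (CodeIgniter 2.x layout; an assumption)
// Only the charset-related keys are shown; the other keys stay as they are.
$db['default']['char_set'] = 'utf8';            // charset used for the DB connection
$db['default']['dbcollat'] = 'utf8_general_ci'; // collation used for the DB connection
```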

But I wanted to know which directive did the trick. So I deleted all the above three changes. Rebooted the system so that any transient database settings are forgotten. And then tried again expecting to see junk characters. But I saw Telugu characters!

Did the code in the last bullet go and change the database default char_set permanently (and not just for that program run/PHP page execution)? Well, phpMyAdmin shows the collation as latin1_swedish_ci for the two tables that the ci database has (including the telugu2english table). Of course, the telugu column of the telugu2english table is shown as having collation utf8_general_ci. So nothing has changed from what it was before.
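
One way to check what the connection is actually using, instead of guessing from the table collations, is to ask MySQL directly. This is only a rough sketch: the host, user, password and database name below are placeholders, not the real settings.

```php
<?php
// Placeholder connection details (hypothetical); replace with your own.
$con = mysqli_connect('localhost', 'user', 'password', 'database');
if (!$con) {
    die('Connect failed: ' . mysqli_connect_error());
}

// Character set the client library is using for this connection.
echo 'Client charset: ' . mysqli_character_set_name($con) . "<br>";

// Server-side character set variables as seen by this session.
$result = mysqli_query($con, "SHOW VARIABLES LIKE 'character_set%'");
while ($row = mysqli_fetch_assoc($result)) {
    echo $row['Variable_name'] . ' = ' . $row['Value'] . "<br>";
}

mysqli_close($con);
```

Running this once with and once without the mysqli_set_charset() call should show whether the character_set_client, character_set_connection and character_set_results values differ between the two cases.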

Right now things work. But my knowledge of the underlying UTF-8 handling in both MySQL and the CI/PHP database access functions/classes is poor. I will need to understand it properly somewhere down the line.
