# UTF-8
# Input
$string = $_REQUEST['user_comment'];
if (!mb_check_encoding($string, 'UTF-8')) {
    // The string is not valid UTF-8, so detect its encoding and re-encode it.
    // Note: mb_detect_encoding() is a heuristic and can guess wrong.
    $actualEncoding = mb_detect_encoding($string);
    $string = mb_convert_encoding($string, 'UTF-8', $actualEncoding);
}
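
The detection step above can be made a little more predictable by passing an explicit candidate list and enabling strict mode. The following is only a sketch: the helper name `toUtf8` and the default candidate list are assumptions for illustration, not part of the original example.

```php
<?php
/**
 * Hypothetical helper: return $input as valid UTF-8.
 * The default candidate list is an assumption; use the encodings you actually expect.
 */
function toUtf8(string $input, array $candidates = ['UTF-8', 'ISO-8859-1']): string
{
    if (mb_check_encoding($input, 'UTF-8')) {
        return $input; // already valid UTF-8, nothing to do
    }

    // Strict mode rejects candidates in which the byte sequence is invalid.
    $detected = mb_detect_encoding($input, $candidates, true);
    if ($detected === false) {
        // None of the candidates matched, so refuse rather than guess.
        throw new InvalidArgumentException('Could not determine the input encoding.');
    }

    return mb_convert_encoding($input, 'UTF-8', $detected);
}

// "caf\xE9" is "café" encoded as ISO-8859-1 (a single byte for "é").
$clean = toUtf8("caf\xE9");
```

Throwing on an undetectable encoding is a design choice made for this sketch; an application might instead reject the request or substitute a replacement character.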
# Output
header('Content-Type: text/html; charset=utf-8');
- HTML5:
<meta charset="utf-8">
- Older versions of HTML:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
- Specify the `utf8mb4` character set on all tables and text columns in your database. This makes MySQL physically store and retrieve values encoded natively in UTF-8.
MySQL will implicitly use `utf8mb4` encoding if a `utf8mb4_*` collation is specified (without any explicit character set).
- Older versions of MySQL (< 5.5.3) do not support `utf8mb4` so you'll be forced to use `utf8`, which only supports a subset of Unicode characters.
- In your application code (e.g. PHP), in whatever DB access method you use, you'll need to set the connection charset to `utf8mb4`. This way, MySQL does no conversion from its native UTF-8 when it hands data off to your application and vice versa.
- Some drivers provide their own mechanism for configuring the connection character set, which both updates their own internal state and informs MySQL of the encoding to be used on the connection. This is usually the preferred approach.
For example (the same consideration regarding `utf8mb4`/`utf8` applies as above):
- If you're using the [PDO](http://www.php.net/manual/en/book.pdo.php) abstraction layer with PHP ≥ 5.3.6, you can specify `charset` in the [DSN](http://www.php.net/manual/en/ref.pdo-mysql.connection.php):
$handle = new PDO('mysql:charset=utf8mb4');
- If you're using [mysqli](http://www.php.net/manual/en/book.mysqli.php), you can call [`set_charset()`](http://www.php.net/manual/en/mysqli.set-charset.php):
$conn = mysqli_connect('localhost', 'my_user', 'my_password', 'my_db');
$conn->set_charset('utf8mb4'); // object oriented style
mysqli_set_charset($conn, 'utf8mb4'); // procedural style
- If you're stuck with plain [mysql](http://www.php.net/manual/en/book.mysql.php) (an extension that was deprecated in PHP 5.5 and removed in PHP 7.0) but happen to be running PHP ≥ 5.2.3, you can call [`mysql_set_charset`](http://www.php.net/manual/en/function.mysql-set-charset.php):
$conn = mysql_connect('localhost', 'my_user', 'my_password');
// The old mysql extension is procedural only; the charset comes first, then the link.
mysql_set_charset('utf8mb4', $conn);
- If the database driver does not provide its own mechanism for setting the connection character set, you may have to issue a query to tell MySQL how your application expects data on the connection to be encoded: [`SET NAMES 'utf8mb4'`](http://dev.mysql.com/doc/en/charset-connection.html).
- You need to make sure that every time you process a UTF-8 string, you do so safely. This is, unfortunately, the hard part. You'll probably want to make extensive use of PHP's [`mbstring`](http://www.php.net/manual/en/book.mbstring.php) extension.
- **PHP's built-in string operations are *not* UTF-8 safe by default.** There are some things you can safely do with normal PHP string operations (like concatenation), but for most things you should use the equivalent [`mbstring`](http://www.php.net/manual/en/book.mbstring.php) function.
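
To illustrate the difference between byte-oriented and multibyte-aware functions, here is a minimal sketch. It assumes the `mbstring` extension is loaded and PHP 7 or later (for the `\u{...}` escape); the sample word is arbitrary.

```php
<?php
// "naïve": the "ï" occupies two bytes in UTF-8.
$word = "na\u{00EF}ve";

$bytes = strlen($word);                        // counts bytes: 6
$chars = mb_strlen($word, 'UTF-8');            // counts characters: 5

// Cutting by bytes can split a multibyte character in half...
$cutBytes = substr($word, 0, 3);
$cutBytesIsValid = mb_check_encoding($cutBytes, 'UTF-8'); // false

// ...while cutting by characters keeps the string valid.
$cutChars = mb_substr($word, 0, 3, 'UTF-8');   // "naï"

// Case conversion that is aware of non-ASCII letters.
$upper = mb_strtoupper($word, 'UTF-8');        // "NAÏVE"
```

Passing the encoding explicitly to each `mb_*` call, as done here, avoids depending on the configured internal encoding.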
# Remarks
# Data Storage and Access
This topic specifically covers UTF-8 and the considerations for using it with a database. If you want more information about using databases in PHP in general, check out the separate topic on that subject.
Storing Data in a MySQL Database:
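
As a rough sketch of the table-level setting this topic describes, the statements below declare `utf8mb4` explicitly on a table and a text column. The table name `comments`, its columns, and the choice of the `utf8mb4_unicode_ci` collation are assumptions made for illustration.

```php
<?php
// Hypothetical table and column names; utf8mb4_unicode_ci is one of several utf8mb4_* collations.
$createTable = <<<'SQL'
CREATE TABLE comments (
    id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    body TEXT CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL
) DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci
SQL;

// Converting an existing (equally hypothetical) table changes its text columns as well.
$convertTable = 'ALTER TABLE comments CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci';

// $pdo is assumed to be an already-open PDO connection to MySQL:
// $pdo->exec($createTable);
```

The statements are kept as plain strings so the sketch runs without a database; executing them requires a live connection.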
Accessing Data in a MySQL Database:
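
A fuller connection setup might look like the following sketch. The helper `buildMysqlDsn`, the host, database name, and credentials are all placeholders, and the actual connection is left commented out because it needs a running MySQL server.

```php
<?php
/**
 * Hypothetical helper that builds a PDO MySQL DSN with an explicit charset.
 */
function buildMysqlDsn(string $host, string $dbname, string $charset = 'utf8mb4'): string
{
    return sprintf('mysql:host=%s;dbname=%s;charset=%s', $host, $dbname, $charset);
}

$dsn = buildMysqlDsn('localhost', 'my_db');
// $dsn is now "mysql:host=localhost;dbname=my_db;charset=utf8mb4"

// Placeholder credentials; uncomment when a server is available.
// $pdo = new PDO($dsn, 'my_user', 'my_password', [
//     PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
// ]);

// Fallback for a driver that offers no charset option of its own:
// $pdo->exec("SET NAMES 'utf8mb4'");
```

Setting the charset in the DSN is generally preferable to `SET NAMES`, because the driver then also knows which encoding the connection uses.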