You need to add Emoji storage support to your Rails application, but the DB either rejects the data or mangles the Emoji.
The issue lies in how Emoji are encoded: they are supplementary Unicode characters (outside the Basic Multilingual Plane) that need 4 bytes in UTF-8, whereas MySQL’s ‘utf8’ character set stores at most 3 bytes per character. By default, your MySQL DB will probably be ‘utf8’, which is perfectly fine in most cases. However, to store Emoji and other 4-byte characters you’ll need to create your DB in ‘utf8mb4’. Unfortunately, this is not nearly as straightforward as it should be. Rails does not account for the change from 3 bytes to 4 bytes per character, so it ends up trying to create indexes on VARCHAR columns that are too long for the storage engine: an index on a VARCHAR(255) column needs up to 765 bytes under ‘utf8’ but up to 1020 bytes under ‘utf8mb4’, which exceeds the 767-byte key limit. You’ll see something like:
Mysql2::Error: Specified key was too long; max key length is 767 bytes: CREATE UNIQUE INDEX 'unique_schema_migrations' ON 'schema_migrations' ('version')
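The byte arithmetic behind this error can be checked in plain Ruby. The following is a minimal sketch: the 767-byte figure is taken from the error message, while the constant and the helper `max_indexable_chars` are names made up for this illustration.

```ruby
# Limit quoted in the "Specified key was too long" error message.
MAX_KEY_BYTES = 767

# Largest VARCHAR length (in characters) whose index still fits the limit,
# given the worst-case number of bytes per character. Hypothetical helper.
def max_indexable_chars(bytes_per_char, limit = MAX_KEY_BYTES)
  limit / bytes_per_char
end

emoji = "\u{1F600}" # grinning face, outside the Basic Multilingual Plane
euro  = "\u20AC"    # euro sign, inside the Basic Multilingual Plane

puts emoji.bytesize          # 4 bytes in UTF-8
puts euro.bytesize           # 3 bytes in UTF-8
puts max_indexable_chars(3)  # 255: a VARCHAR(255) index fits with 3-byte characters
puts max_indexable_chars(4)  # 191: with 4-byte characters the limit drops to 191
```

This is where the value 191 used in the steps below comes from: 191 × 4 = 764 bytes, which stays under 767.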
Caveats
- You need MySQL >= 5.5.3, the first release that supports utf8mb4.
- You may need Rails >= 4. This has not been tested on Rails 3.x.
- This is assuming you can destroy and recreate your DB. If you need to upgrade a live database, you’ll have to look up instructions on how to do so and be aware that there’s no guarantee it’ll be completely data-safe.
Steps
Override the default VARCHAR index length setting
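Create an initializer and paste the following into it. It lowers the default length of string columns to 191 characters, so that an index needs at most 191 × 4 = 764 bytes and stays under the 767-byte limit: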
# /config/initializers/utf8mb4.rb
require 'active_record/connection_adapters/abstract_mysql_adapter'

module ActiveRecord
  module ConnectionAdapters
    class AbstractMysqlAdapter
      NATIVE_DATABASE_TYPES[:string] = { :name => "varchar", :limit => 191 }
    end
  end
end
Update your database.yml to specify utf8mb4 encoding:
# database.yml
integration:
  #...
  adapter: mysql2
  encoding: utf8mb4
  charset: utf8mb4
  collation: utf8mb4_unicode_ci
Make sure to replace the existing UTF8 specification, if present.
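If you have several environments, it is easy to miss one that still says utf8. The sketch below shows one way to spot them with Ruby's standard YAML library. The helper name `environments_missing_utf8mb4`, the sample config and the `legacy` environment are hypothetical, and the sketch assumes a plain YAML file without ERB tags or YAML aliases.

```ruby
require "yaml"

# Returns the names of the environments whose encoding is not utf8mb4.
# Hypothetical helper; assumes plain YAML (no ERB, no aliases).
def environments_missing_utf8mb4(yaml_text)
  config = YAML.safe_load(yaml_text) || {}
  config.select do |_env, settings|
    settings.is_a?(Hash) && settings["encoding"] != "utf8mb4"
  end.keys
end

# Made-up example config: one converted environment, one left on utf8.
SAMPLE_CONFIG = <<~CONFIG
  integration:
    adapter: mysql2
    encoding: utf8mb4
    collation: utf8mb4_unicode_ci
  legacy:
    adapter: mysql2
    encoding: utf8
CONFIG

p environments_missing_utf8mb4(SAMPLE_CONFIG) # ["legacy"]
```

To check a real file, you could pass `File.read("config/database.yml")` instead of the sample string, provided the file meets the assumptions above.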
Recreate your DB
bundle exec rake db:drop db:create db:migrate db:seed RAILS_ENV=<ENV>
If you can’t recreate your database, you can convert each table in place instead, substituting your own table and column names:
ALTER TABLE `table_name` CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci, MODIFY column_name TEXT CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
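With more than a handful of tables, you may prefer to generate the statements rather than type them. This is a minimal sketch that only builds the SQL strings and does not touch a database. `conversion_sql` and the table names are hypothetical, and it covers only the CONVERT TO part of the command above, not the per-column MODIFY clause.

```ruby
# Builds the table-level conversion statement for one table.
# Hypothetical helper; it only returns a string and runs nothing.
def conversion_sql(table, charset: "utf8mb4", collation: "utf8mb4_unicode_ci")
  # Refuse anything that is not a plain identifier instead of trying to escape it.
  unless table.match?(/\A[A-Za-z0-9_]+\z/)
    raise ArgumentError, "unexpected table name: #{table.inspect}"
  end

  "ALTER TABLE `#{table}` CONVERT TO CHARACTER SET #{charset} COLLATE #{collation};"
end

tables = %w[users comments] # made-up table names, replace with your own
tables.each { |table| puts conversion_sql(table) }
```

Review the output before running it against a live database, and take a backup first, since the conversion is not guaranteed to be data-safe.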