Recently, at eet.nu we have been working on migrating a Ruby on Rails application from MySQL to PostgreSQL. Depending on the complexity of an app, there might be many caveats during migration. Here are some notes on issues that we solved during the migration.
This article assumes that we work with:
- Ruby on Rails 4;
- MySQL 5.6;
- PostgreSQL 9.4.
Quoting and Boolean fields
MySQL uses backticks (`) for quoting, while in PostgreSQL double quotes (") are being used.
MySQL uses TINYINT datatype to represent boolean fields. Therefore, 1 would represent true and 0 would represent false. PostgreSQL has native boolean datatype.
Instead of writing where(visible: 1)
in MySQL, in PostgreSQL it would be where(visible: true)
.
Sorting with ’nil'
If we order by a column, which contains nil (or NULL in SQL) values, then we might get different results in MySQL and PostgreSQL.
In MySQL NULL values will be put last if ORDER BY ... DESC
is used.
In PostgreSQL NULL values are assumed to be larger than any non-null value. Therefore, NULL values will be put first if ORDER BY ... DESC
is used.
Some possible solutions are:
-
(PostgreSQL specific) Use
NULLS LAST
orNULLS FIRST
options:order("average_rating NULLS LAST")
-
Add a minus sign before column name and revert sorting order(e.g., from DESC to ASC):
order("-average_rating ASC")
Usage of prepared and unprepared statements.
Since Rails 3.1 prepared statements are enabled by default. Currently, gem mysql2 does not support prepared statements (see this pull request) while gem pg supports them. It means, when using MySQL with rails method to_sql
it will return a SQL query without bind values, but if PostgreSQL is being used then to_sql
method would return a SQL query that requires bind values (e.g., "user_id" = $1
).
In order to use unprepared statements in Rails 4 with PostgreSQL they should be wrapped into Model.connection.unprepared_statement { query.to_sql }
Division by zero
In MySQL there is a special mode that changes the behavior of handling division by zero - ERROR_FOR_DIVISION_BY_ZERO. If this mode is disabled, then MySQL will silently insert NULL.
PostgreSQL is stricter and it will produce ERROR: division by zero
. The function NULLIF(value1, value2)
can be used to replace zero values with NULL.
Functions
Some built-in functions have different implementations or not present in PostgreSQL.
UNIX_TIMESTAMP()
In MySQL the function UNIX_TIMESTAMP()
returns a Unix timestamp for the current time. The equivalent in PostgreSQL is: extract(epoch from now())
FIELD()
The function FIELD(str, str1, str2,...)
in MySQL returns a position of str in str1, str2,… (imagine that it is an array). Unfortunately, there is no such function in PostgreSQL. For arrays that contain only integer values, we can use function idx
from intarray
module (documentation).
See this gist for more suggestions on how to implement similar behavior in PostgreSQL.
Datatypes migration
There might be a need to convert from one datatype to another during migration. For example, a column should have a bigint datatype instead of text. Since in PostgreSQL there is no implicit conversion from text to int, we need to use ALTER TABLE
(PostgreSQL documentation) command or using Rails migrations:
change_column :users, :uid, 'bigint USING CAST(uid as bigint)'
Default sort order
There is no default order of records returned from a database if ORDER BY
is not specified.
From PostgreSQL documentation:
If sorting is not chosen, the rows will be returned in an unspecified order.
Hence, if some tests started to fail because order of records is not the same as expected, then it is a good idea to explicitly specify sorting order.
Rounding
Both database management systems have ROUND()
function to round numbers. By default, they might give different results.
mysql> select round(55/10);
+----------------+
| round(55/10) |
+----------------+
| 6 |
+----------------+
and in PostgreSQL:
psql=# select round(55/10);
round
-------
5
It seems that in PostgreSQL by default function ROUND()
assumes that result of 55/10 is of datatype double precision, which is not stored as exact value. In order to get expected result from ROUND()
we should pass it at least one numeric value, e.g.:
psql=# select round(55::numeric/10);
round
-------
6
Data migration
There are different solutions on to how migrate existing data from MySQL to PostgreSQL, but pgloader worked best for us. It is straightforward to use and it is relatively fast.
Interesting reading
Comparison of different SQL implementations (not very fresh but still interesting).