PostgreSQL | Converting tsearch2 to 8.3
One of the major features in the upcoming 8.3 release is the integration of
the tsearch2
full text search extension as a
core PostgreSQL feature.
While there are no fundamental changes, there are some differences which make upgrading from an
existing installation a little tricky. The following are my notes from upgrading a test version of the database
which powers this website.
First off, to make administration easier, I've always kept the tsearch2
tables, functions etc. in their own schema (cunningly named tsearch
) - this makes
the job of separating the legacy tsearch2
elements from the rest of the database
a lot easier.
-
DML dump: Using 8.3's
pg_dump
, extract the DML except for thetsearch
schema:~/pg83b2/bin/pg_dump -p5431 -U postgres -N tsearch -s cms > /home/ian/outgoing/cms-dml.sql
(The-N
option, introduced in PostgreSQL 8.1, specifies schemas to exclude).
-
Schema modification: remove any references to the
tsearch
schema; change index definitions fromGIST
toGIN
. Note that the names oftsearch2
functions have changed, and are now prefixed withts_
, e.g.headline()
is nowts_headline()
. Any backend functions which use these will have to be modified as well. Import the modified schema into the new 8.3 database.
-
Data dump: as with the DML, excluding the
tsearch
schema:~/pg83b2/bin/pg_dump -p5431 -U postgres -N tsearch -a cms -Fc > /home/ian/outgoing/cms-data.pg
Restore will probably require the--disable-triggers
option:~/pg83b2/bin/pg_restore --disable-triggers -U postgres -Fc -d cms -p5431 cms-data.pg
-
Code modification:
As mentioned above,
tsearch2
function names have changed; any which are used in application code will need to be changed too. (A good argument for encapsulating this kind of stuff into server-side functions, but that's life with "legacy" software... Fortunately the application only uses a few standardized full text search queries, neatly contained in their own library, so modifying them was fairly simple).
And voila, it works - at least more-or-less. What's still missing here are any configuration
changes to the original database's tsearch
schema, e.g. the
pg_ts_config
table. (And having done all this, no doubt someone will come along
and tell me about the easier way...)
(Added) some useful links:
- Tsearch2 8.3 changes from Oleg Bartunov, including list of renamed functions
- Thread on upgrading issues from the pgsql-hackers list: full text search in 8.3