Sei sulla pagina 1di 6

10/17/2019 PostgreSQL- Setting up Full Text Search | Hevo Blog

PostgreSQL- Setting up Full Text Search

By Sarad in Tutorial

Introduction:

Setting up Full-Text Search (FTS) in PostgreSQL helps users bring up a fully


edged search engine without a steep learning curve and new softwares to
install. It can be used to set up anything from a small catalog website to a huge
news website.

You might wonder what’s the big deal with Full-Text Search, you can use LIKE or
ILIKE clause any day in a query to perform a search on any given string. The
problem with traditional LIKE clause is that it has a very high-performance
penalty. That means, it takes a long time to return results as the words inside the
text are not indexed. Also, the results won’t be as expected, for e.g. you might
search rectify in a document which contains recti es – while ideally the
document should show up in the search result but it doesn’t.
Prerequisites:

https://hevodata.com/blog/postgresql-full-text-search-setup/ 1/6
10/17/2019 PostgreSQL- Setting up Full Text Search | Hevo Blog

Before getting started you need to have a working copy of PostgreSQL installed
on your favorite OS.

Step 1: Preparing the Database

We will rst create a table

CREATE TABLE tv_series (


id SERIAL PRIMARY KEY,
title TEXT NOT NULL,
description TEXT NOT NULL,
creator TEXT NOT NULL
);

Next, we will insert some data to start querying

INSERT INTO tv_series


(id, title, description, creator)
 VALUES
(1, 'Stranger Things', 'Set in the fictional town
 (2, 'Breaking Bad', 'BB is an American neo-western
(3, 'Band of Brothers', 'Band of Brothers is a 200
 (4, 'Rick and Morty', 'Rick and Morty is an Americ

Step 2: Exploring Data

We will now have to create a document, which is simply just concatenating all the
elds we want to perform a search on.

SELECT
(title || ' ' || description || ' ' || creator) AS
FROM tv_series;

At this stage we just have a string, to make it more actionable we need to


transform it using to_tsvector().

SELECT
(to_tsvector(title) || to_tsvector(description) ||
FROM tv_series;

https://hevodata.com/blog/postgresql-full-text-search-setup/ 2/6
10/17/2019 PostgreSQL- Setting up Full Text Search | Hevo Blog

The result we have got in return is in tsvector form, this is more suitable to
perform a full-text search. But if you notice, the result looks weirder than
anything we originally had. If you’d notice, you will nd that rst record in the
result has, in fact, reduced the word investigation to investig, also there are some
random numbers in front of each word in the result. What really happened is
that Postgres just sorted a list of distinct lexemes, which are words that have
been normalized to merge di erent variants of the same word.

Let’s start querying the data. We will be using @@ operator to perform a query
and check if a tsvector matches tsquery

SELECT to_tsvector(description) @@ 'rick' FROM tv_seri


?column?
----------
t
(1 row)

Yes, it did match. Now let’s try another query.


SELECT to_tsvector(description) @@ 'morty' FROM tv_ser
 ?column?
----------
 f
(1 row)

Sadly, the result is negative. To solve this we need to build a tsquery which
creates the same lexemes and using the operator @@, casts the string into a
tsquery. The following shows the di erence between casting and using the
function to_tsquery().

SELECT 'morty'::tsquery, to_tsquery('morty');


tsquery | to_tsquery
---------+------------
'morty' | 'morti'
(1 row)

But in the case of ‘rick’ the stem is equal to the word.

SELECT 'rick'::tsquery, to_tsquery('rick');


tsquery | to_tsquery
---------+------------

https://hevodata.com/blog/postgresql-full-text-search-setup/ 3/6
10/17/2019 PostgreSQL- Setting up Full Text Search | Hevo Blog

'rick' | 'rick'
(1 row)

So we will have to use to_tsquery() for querying documents.

SELECT to_tsvector(description) @@ to_tsquery('morty')


?column?
----------
t
(1 row)

Step 3: Improving Performance of Full-Text Search

In a production environment we can’t create tsvector at runtime and expect it to


run at lightning speed, we will have to pre create it whenever a new entry is
inserted in our table. For this, we will create a new column in our table named
document_vectors of data type tsvector. Also, we would require indexes to
perform queries at speeds higher than that of light. When it comes to
 tsvector,  Postgres provides us with GiST (Generalized Search Tree) and GIN
(Generalized Inverted Index) indexes. The main di erence between them is, GiST
 is faster to build but requires additional reads, while on the other hand GIN is
slower to build but faster to query when compared to GiST. So if your database is
 read intensive you must go with GIN.

ALTER TABLE tv_series ADD "document_vectors" tsvector;


CREATE INDEX idx_fts_doc_vec ON tv_series USING gin(do

Next update the tsvector of existing rows

UPDATE
tv_series
SET
document_vectors = (to_tsvector(title) || to_tsvec

For more fun we will insert a new row in the column this time with tsvector of
document.

INSERT INTO tv_series


(id, title, description, creator, document_vectors

https://hevodata.com/blog/postgresql-full-text-search-setup/ 4/6
10/17/2019 PostgreSQL- Setting up Full Text Search | Hevo Blog

VALUES
(5, 'Better Call Saul', 'Better Call Saul is an Am

Finally, let’s search for my favorite series Breaking Bad.

SELECT id, title, creator FROM tv_series WHERE documen


id | title | creator
----+------------------+-----------------------------
2 | Breaking Bad | Vince Gilligan
5 | Better Call Saul | Vince Gilligan, Peter Gould
(2 rows)

We were able to successfully query all the documents which had the word
Breaking Bad in it with just the word ‘break’.

For more such posts, visit our blog page.


Database PostgreSQL

Move Data from any Source to


Warehouse in Real-time
Sign up today to get $500 Free Credits to try Hevo!

START FREE TRIAL

https://hevodata.com/blog/postgresql-full-text-search-setup/ 5/6
10/17/2019 PostgreSQL- Setting up Full Text Search | Hevo Blog

PLATFORM RESOURCES COMPANY FIND US ONLINE

Platform Blog Contact Us

Integration Documentation Careers

Pricing Team

Request a Demo Privacy

© Hevo Data Inc. 2019. All Rights Reserved.


https://hevodata.com/blog/postgresql-full-text-search-setup/ 6/6

Potrebbero piacerti anche