SQL Server

Some tips for using full-text search in SQL Server 2016

Some tips for using full-text search in SQL Server 2016


Use a full-text query instead of the LIKE Transact-SQL predicate if you need to
query
formatted binary data or query a large amount of unstructured text data.

A full-text query against millions of rows of text data can take only seconds; whereas

a LIKE query against the same data can take minutes to return.

Reduce the full-text unique key size.

To create a full-text index, the table to be indexed must have a unique index. Try to

select a numeric column as the full-text unique key to increase the speed of full-text

population. If the table to be indexed does not have numeric unique index, consider

creating numeric unique index.

Set the virtual memory size to at least 3 times the physical memory installed in

the computer, and set the SQL Server ‘max server memory’ server configuration

option to half the virtual memory size setting (1.5 times the physical memory).

Because working with full-text search is very resource expensive, you should have

enough physical and virtual memory.

Update the statistics on the clustered index or the full-text key for a full population.

Using so, you can help a multi-range population to generate good partitions on the table.

Set the ‘max full-text crawl range’ option to the number of CPUs on the server box.

Setting this option to the number of CPUs on the server box allows optimize CPU

utilization, which improves crawl performance during a full-text index crawl.

Because the ‘max full-text crawl range’ configuration option is an advanced option, you

should set the ‘show advanced option’ option to 1 to make the ‘max full-text crawl

range’ available.

Note. Setting the ‘max full-text crawl range’ option takes effect immediately without

a server restart.

Consider using the full-text property searching.

SQL Server 2016 supports property searching. Now, you can configure a full-text

index to support property-scoped searching on properties, which are emitted by IFilters.

Use an integer data type for the first column of the clustered index of the base table.

Using an integer data type for the first column of the clustered index of the base table

produces the highest full-text index population speed.

Build a secondary index on a timestamp column.

By using so, you can improve the performance of incremental population.

If you have several physical disks, place the database files separately from the

full-text catalog files.

In this case, you can improve the speed of full-text queries, because multiple disks

can process input/output requests concurrently.

Consider using search across multiple columns.

In SQL Server 2016, you can specify an arbitrary number of columns in a full-text

predicate via a column list.

If you have several physical disks, create several Pagefile.sys files, so that each

Pagefile.sys file will be placed on its own physical disk.

Spreading paging files across multiple disk drives and controllers improves

performance on most disk systems because multiple disks can process input/output

requests concurrently.

Use the top_n_by_rank parameter with CONTAINSTABLE or FREETEXTTABLE.

It can be used to restrict the number of rows returned. The top_n_by_rank parameter

specifies that only the n-highest ranked matches, in descending order, will be returned.

Try to use the CONTAINS or FREETEXT predicates instead of the CONTAINSTABLE

or FREETEXTTABLE functions to simplify the query’s text.

Because qualifying rows returned by the CONTAINSTABLE or FREETEXTTABLE

rowset functions must be explicitly joined with the rows in the original SQL Server table,

the queries that use the CONTAINSTABLE and FREETEXTTABLE functions are more

complex than those that use the CONTAINS and FREETEXT predicates.

Consider limit the size of the buffer pool before you perform a full population.

If the sqlservr.exe process tries to grab all available memory for the buffer pool,

out-of-memory conditions and failure to allocate shared memory can occur for the

fdhost.exe process. You can limit the size of the buffer pool by setting the

‘max server memory’ server option to leave enough memory for the fdhost.exe

process and operating system use.

Use full-text data definition language (DDL) statements to create, modify, and
drop
full-text catalogs and indexes.

You can use CREATE FULLTEXT CATALOG, ALTER FULLTEXT CATALOG,

DROP FULLTEXT CATALOG, CREATE FULLTEXT INDEX, ALTER FULLTEXT

INDEX and DROP FULLTEXT INDEX statements to create, modify, and drop

full-text catalogs and indexes. Because using the full-text data definition language
(DDL) statements is more efficient than using the stored procedures and the stored
procedures, which used to work with full-text catalogs and indexes, will be removed
in a future version of SQL Server, you should use full-text data definition language
(DDL) statements to create, modify, and drop full-text catalogs and indexes.

Consider using the NEAR option of the CONTAINS predicate or CONTAINSTABLE

function.

SQL Server 2016 supports the NEAR option of the CONTAINS predicate or

CONTAINSTABLE function. By using the custom NEAR option you can do the following:

– specify the maximum number of non-search terms that separate the first and last
search terms in a match

– specify that words and phrases are matched only if they occur in the order in which

you specify them.


Set the ‘awe enabled’ server configuration option to 1 if you have more than

4 gigabytes (GB) of physical memory.

Beginning in SQL Server 2008, the full-text engine can use AWE memory because

the full-text engine is part of the sqlservr.exe. Because the ‘awe enabled’ configuration

option is an advanced option, you should set the ‘show advanced option’ option to 1

to make the ‘awe enabled’ available.

Note. You must restart the SQL Server 2016 to apply changes to the ‘awe enabled’ option.


Make full-text index population during periods of low database access.

Because full-text index population takes some time, these updates should be
scheduled during CPU idle time and slow production periods.

Assign a very large table (a table that has millions of rows) to its own full-text catalog.

This can improve performance, and can be used to simplify administering and monitoring.