SQL in Cloud

Recent Posts

  • Inconsistent database ids in SQL Azure V12
  • Taking database offline SQL DB
  • Minimize login time in SQL DB V12
  • Full-text search in SQL Azure
  • Alter database – SQL Azure V12

Recent Comments

    Archives

    • May 2016
    • December 2015
    • July 2015
    • May 2015
    • April 2015

    Categories

    • SQL Azure
    • Uncategorized

    Meta

    • Log in
    • Entries feed
    • Comments feed
    • WordPress.org

    Full-text search in SQL Azure

    What is Full-text search in SQL?

    Full-Text Search in SQL Server lets users and applications run full-text queries against character-based data in SQL Server tables. Full-text queries perform linguistic searches against text data in full-text indexes by operating on words and phrases based on rules of a particular language such as English or Japanese. Full-text queries can include simple words and phrases or multiple forms of a word or phrase. A full-text query returns any documents that contain at least one match (also known as a hit). A match occurs when a target document contains all the terms specified in the full-text query, and meets any other search conditions, such as the distance between the matching terms.

    LIKE vs Full-text-Search

    Full-text search can perform a linguistic search whereas SQL LIKE character patterns only. Also, like predicate cannot be used to query formatted binary data unless converted into the character format. A LIKE query against millions of rows of text data can take minutes to return; whereas a full-text query can take only seconds or less against the same data, depending on the number of rows that are returned.

    For example, if you want to find words that are forms of “mouse” like mouse, mice, mouses then FT search can do that easily. Using LIKE you have to code all forms of the word & this will differ based on the language. Run the below query to see all forms of mice.

    select * from sys.dm_fts_parser(‘formsof(freetext, mice)’, 1033, 0, 0)

    The above query returns the following display terms. mice’s, mouse, mouse’s, moused, mouses, mousing,mice. That means by searching the word mice using full-text search returns all the documents that matches the above 7 words. With just like one need to issue 7 search queries

    What should I do to enable full-text search in SQL Azure?

    Absolutely nothing. SQL Azure configures the full-text search in Azure be default in all the databases irrespective of the tiers with no additional cost. The configuration parameters can’t be changed too.

    What documents types are supported?

    Full-text search in Azure supports 100+ document types and one can query the list of supported document types in Azure as:

    select * from sys.fulltext_document_types where version != ”

    Full-text search vs Azure search:

    Microsoft has both full-text search with SQL Azure and Azure search service, which one should I use would be a very common question for the application developers. This section will briefly explain what works best in what scenarios.

    If one needs both relational data and search at one location, SQL Azure full-text search is an obvious choice.

    Scenario Full-text search Azure search service
    I have relational data and needs search Consider
    I want to offload search outside my OLTP Consider
    I don’t want to change the existing Application Consider
    I don’t want to spend more money on search service Consider

    I don’t want a database Consider
    I have more than 15 concurrent queries

    Consider
    I store my files in blob storage Extra work needed Not very different than storing in SQL db and indexing
    Geo replication Supported Supported
    Scale Supported Supported

    Example:

    CREATE FULLTEXT CATALOG DocumentCatalog;

    create table DocumentRepository(DocumentID int not null, DocTextData varbinary(max), DocType char(4))

    — Create unique index on the document id
    CREATE UNIQUE INDEX ui_ukDoc ON DocumentRepository(DocumentID);

    CREATE FULLTEXT INDEX ON DocumentRepository
    (
    DocTextData –Full-text index column name
    TYPE COLUMN DocType –Name of column that contains file type information
    Language 2057 –2057 is the LCID for British English
    )
    KEY INDEX ui_ukDoc ON DocumentCatalog WITH CHANGE_TRACKING AUTO

    insert into DocumentRepository values(1, convert(varbinary(max), ‘mouse is running’), ‘txt’)
    insert into DocumentRepository values(2, convert(varbinary(max), ‘mice ran away’), ‘txt’)
    insert into DocumentRepository values(3, convert(varbinary(max), ‘This is SQL full-text introduction’), ‘txt’)

    SELECT * FROM DocumentRepository WHERE FREETEXT (DocTextData, ‘run’ );

    SELECT * FROM DocumentRepository WHERE FREETEXT (DocTextData, ‘mice’ );

    Limitations:

    Not all document types are supported (most common like pdf, fox and docx aren’t supported).
    No semantic search support
    No ability to change configuration options
    Unable to see crawl errors

    How to index docx / pdf documents?

    One can use IFilters to grab the text in the application and then store it as varbinary in SQL table as document type txt. One such example is here

    Leave a comment Cancel reply

    Your email address will not be published. Required fields are marked *

    Pages

    • Alter database in SQL V12
    • Changing MAX_DOP setting in SQL Azure DB V12
    • Connection Pooling and Isolation level reset
    • Contained users in SQL Azure DB V12
    • Full-text search in SQL Azure
    • Large Index Rebuild in SQL Azure V12
    • Row versioning in SQL Database version (V12)
    • Sql Azure db – database offline

    Archives

    • May 2016
    • December 2015
    • July 2015
    • May 2015
    • April 2015

    Categories

    • SQL Azure (8)
    • Uncategorized (3)

    WordPress

    • Log in
    • WordPress

    Subscribe

    • Entries (RSS)
    • Comments (RSS)
    • Alter database in SQL V12
    • Changing MAX_DOP setting in SQL Azure DB V12
    • Connection Pooling and Isolation level reset
    • Contained users in SQL Azure DB V12
    • Full-text search in SQL Azure
    • Large Index Rebuild in SQL Azure V12
    • Row versioning in SQL Database version (V12)
    • Sql Azure db – database offline