SQL 2008 integrated Full-Text search - VERY slow queries!

SQL 2008 integrated Full-Text search - VERY slow queries!

Post by OGG » Tue, 02 Dec 2008 00:06:21


Hi guys,
I get really slow results for full text search queries. By slow I mean
5 seconds average.
Let me describe what I did so far.
I created a simple table: id,title,description,firstTopic,longTopic.
The table contains now 3,500,000 entries.
The fields I need to index are title and description. So I created a
full text index on these 2 columns.

Next, I ran the following commnds, as I found them on some articles:

-- Set Change Tracking to Manual
ALTER FULLTEXT INDEX ON tblDirectory2 SET CHANGE_TRACKING MANUAL

-- Start a Manual Update of the FullText Catalog
ALTER FULLTEXT INDEX ON dbo.tblDirectory2 START UPDATE POPULATION

-- Rebuild the FT Index completely (may take some time)
ALTER FULLTEXT CATALOG iContentServiceCatalog
REBUILD WITH ACCENT_SENSITIVITY=OFF;

Next, I use the following query to pull data out sorted by rank:

select d.FirstTopic as Topic, COUNT(*) count from
FREETEXTTABLE(tblDirectory2,(Title,Description),'some free text
here',LANGUAGE 'English',300) r
join
tblDirectory2 d on d.Id=r.[key]
group by d.FirstTopic
order by count desc

The result takes about 5 seconds to return, and it is very slow for my
needs.
Can you guys see anything wrong ?
Any help to improve performance will be very much appreciated.

Thanks,
Itay.
 
 
 

SQL 2008 integrated Full-Text search - VERY slow queries!

Post by Hilary Cot » Tue, 02 Dec 2008 03:57:26

What does your execution plan look like?

Do you have a unique index (as opposed or in addition to your primary
key) which is the basis for the full-text index?


> LTER FULLTEXT INDEX ON dbo.tblDirectory2 START UPDATE POPULATI>N >>
> -- Rebuild the FT Index completely (may take some t>me)
> ALTER FULLTEXT CATALOG iContentServiceC>talog
> REBUILD WITH ACCENT_SENSITIV>TY>OFF;
>
> Next, I use the following query to pull data out sorted >y >ank:
>
> select d.FirstTopic as Topic, COUNT(*) cou>t rom
> FREETEXTTABLE(tblDirectory2,(Title,Description),'some f>ee text
> here',LANGUAGE 'English>,300) r> > join
> tblDirectory2 d on d.Id>r.[key]
> group by d.Fi>stTopic
> order by co>nt>desc
>
> The result takes about 5 seconds to return, and it is very slo> for my
> Can you guys see anything>wrong ?
> Any help to improve performance will be very much appr>ci>ted.
>
>>Thanks,
> Itay.

 
 
 

SQL 2008 integrated Full-Text search - VERY slow queries!

Post by OGG » Tue, 02 Dec 2008 04:36:55

Yes, I have a unique index.
I must confess I'm relatively new to full text search, this is why I
described my steps.
Eventually there would be something like 50,000 queries a day.

Thanks.
 
 
 

SQL 2008 integrated Full-Text search - VERY slow queries!

Post by SGlsYXJ5IE » Tue, 02 Dec 2008 05:08:00

Can you post the schema of the table here along with the indexes on it.

Secondly in the query pan, highlight your query and press control L
simultaneoulsy. Save the results and post them as an attachment here.
 
 
 

SQL 2008 integrated Full-Text search - VERY slow queries!

Post by OGG » Tue, 02 Dec 2008 06:49:57

Here it is:
Table:
CREATE TABLE [dbo].[tblDirectory2](
[Id] [int] IDENTITY(1,1) NOT NULL,
[Topic] [nvarchar](500) NOT NULL,
[FirstTopic] [nvarchar](50) NULL,
[Title] [nvarchar](max) NULL,
[Description] [nvarchar](max) NULL,
[Url] [nvarchar](500) NULL
) ON [PRIMARY]

Index:
CREATE UNIQUE CLUSTERED INDEX [idx_c_id] ON [dbo].[tblDirectory2]
(
[Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB
= OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF,
ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

--catalog
CREATE FULLTEXT CATALOG ftCatalog AS DEFAULT;

--create full text index
CREATE FULLTEXT INDEX ON tblDirectory2(Title,Description)
KEY INDEX idx_c_id
ON ftCatalog
WITH
CHANGE_TRACKING OFF,
NO POPULATION;

--populate later on

As for attaching the execution file... I don't see the attaching
option here, so I'll put it on a temp *** link at:
http://www.yqcomputer.com/

Thank you for your help.
Itay
 
 
 

SQL 2008 integrated Full-Text search - VERY slow queries!

Post by SGlsYXJ5IE » Tue, 02 Dec 2008 13:24:01

Can you try this and let me know if there is any improvement?

CREATE UNIQUE INDEX [idx_c_idNew] ON [dbo].[tblDirectory2]
(
[Id] ASC
)include (FirstTopic)

--catalog
CREATE FULLTEXT CATALOG ftCatalog AS DEFAULT;

--create full text index
CREATE FULLTEXT INDEX ON tblDirectory2(Title,Description)
KEY INDEX idx_c_idNew
ON ftCatalog
WITH
CHANGE_TRACKING OFF,
NO POPULATION;

select d.FirstTopic as Topic, COUNT(*) count from
FREETEXTTABLE(tblDirectory2,(Title,Description),'some free text
here',LANGUAGE 'English',300) r
join
tblDirectory2 d on d.Id=r.[key]
group by d.FirstTopic
order by count desc
 
 
 

SQL 2008 integrated Full-Text search - VERY slow queries!

Post by OGG » Tue, 02 Dec 2008 15:14:14

Hi,
Did you mean to drop the clustered index? (I didn't drop it, I just
added the non clustered index as you showed).
Execution time is now 1.5 - 2 seconds. An improvement from 3 - 5
seconds.
But it is still considered a long execution time for me. I need it to
be less than a second. A lot less.

I can change the table. I don't know if it can help. For example, the
title and description columns are [nvarchar](max) where I know for
sure it wont be longer than 5000.
This is the table today:
CREATE TABLE [dbo].[tblDirectory2](
[Id] [int] IDENTITY(1,1) NOT NULL,
[Topic] [nvarchar](500) NOT NULL,
[FirstTopic] [nvarchar](50) NULL,
[Title] [nvarchar](max) NULL,
[Description] [nvarchar](max) NULL,
[Url] [nvarchar](500) NULL
) ON [PRIMARY]

I can change more things. What do you recommend ? The table will have
3,500,000 records, and the records are not deleted or updated or
inserted either. But there will be a lot of "select".

Thanks, Itay.
 
 
 

SQL 2008 integrated Full-Text search - VERY slow queries!

Post by SGlsYXJ5IE » Tue, 02 Dec 2008 21:04:04

You can leave the clustered index. This is an additional one that sql fts
will use.

Can you try this now?

CREATE UNIQUE iNDEX [idx_c_firsttopic_2] ON [dbo].[tblDirectory2]
(
firsttopic,[Id] ASC
)


declare @table table(id int, primary key (id))
insert into @table
select [key] from
FREETEXTTABLE(tblDirectory2,(Title,Description),'test',LANGUAGE 'English' )
select d.FirstTopic as Topic, COUNT(D.ID) count from tblDirectory2 d
join @table r
on d.Id=r.id
group by d.FirstTopic
order by count desc

post the new execution plan again.

Note I have removed the 300 from your freetext clause.
 
 
 

SQL 2008 integrated Full-Text search - VERY slow queries!

Post by OGG » Wed, 03 Dec 2008 00:11:21

Execution time stands on 1 to 2 seconds - without the 300 max results
limitation.
When adding the 300, I get 0.5 to 1 second, Which I hope to reduce
even more...

Execution plan is here: http://www.yqcomputer.com/

Thanks,
Itay.
 
 
 

SQL 2008 integrated Full-Text search - VERY slow queries!

Post by SGlsYXJ5IE » Wed, 03 Dec 2008 00:30:06

I'm looking at your execution plan and am seeing an estimate for 42,000 rows
being returned from your fulltext query.

This is a lot for rows. We might be able to do something funky like
preaggregate the first topic but to know the expensive of this I'll have to
get you to run set statistics io on before the main query and post the
results you get back here.

I think you need to evaluate whether freetext is the correct option. I
suspect that you should be using contains.
 
 
 

SQL 2008 integrated Full-Text search - VERY slow queries!

Post by OGG » Wed, 03 Dec 2008 02:02:02

OK, how do I get out the statistics io ?


From my understanding, the thing about Contains is that I have to pass
it a logical expression - "Hello and World and...Whatever "
As input I get a small paragraph (a line or two long).
From the results I collect the topics (and count them). The topics
count is all the really matter.
If I separate the input text with "AND" or "OR" than I can't really
ask the database to return rows with rank higher as possible,I guess
because with free text it looks for hits containing as many words as
possible, and if I use "AND" I miss the rows with fewer word. Using
"OR" will get too many results I guess.

I am willing to test whatever you suggest in order to make it work
faster :)

Thanks
Itay.
 
 
 

SQL 2008 integrated Full-Text search - VERY slow queries!

Post by OGG » Wed, 03 Dec 2008 03:29:53

If columns Title and Description where not [nvarchar](max), but
[nvarchar](5000) - would it be wise to add another index on these
columns ?
 
 
 

SQL 2008 integrated Full-Text search - VERY slow queries!

Post by SGlsYXJ5IE » Wed, 03 Dec 2008 03:47:02

can you send me a sample of what your search paragraph looks like? I meant
you should use containstable, not contains.

To get set statistics io working type it before you issue these queries, ie

set statistics io on
GO
declare @table table(id int, primary key (id))
insert into @table
select [key] from
FREETEXTTABLE(tblDirectory2,(Title,Description),'test',LANGUAGE 'English' )
select d.FirstTopic as Topic, COUNT(D.ID) count from tblDirectory2 d
join @table r
on d.Id=r.id
group by d.FirstTopic
order by count desc
 
 
 

SQL 2008 integrated Full-Text search - VERY slow queries!

Post by OGG » Wed, 03 Dec 2008 04:30:55

Sample search paragraph:
"Sources: Crennel needs miracle finish to keep job
Browns head coach Romeo Crennel stated emphatically this week that
Brady Quinn will be Cleveland's starting quarterback in 2009. Crennel
might not be Quinn's head coach next season, however."

Here is the message after executing with statistic io:
Informational: The full-text search condition contained noise word(s).
Table 'Worktable'. Scan count 1, logical reads 388252, physical reads
0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob
read-ahead reads 0.

(189519 row(s) affected)

(1 row(s) affected)

(16 row(s) affected)
Table 'tblDirectory2'. Scan count 0, logical reads 568557, physical
reads 34, read-ahead reads 0, lob logical reads 0, lob physical reads
0, lob read-ahead reads 0.
Table '#279A9293'. Scan count 1, logical reads 306, physical reads 0,
read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob
read-ahead reads 0.

(1 row(s) affected)
 
 
 

SQL 2008 integrated Full-Text search - VERY slow queries!

Post by SGlsYXJ5IE » Wed, 03 Dec 2008 13:42:02

ry this:

create table numbers(number int)
GO
declare @int int
set @int=1
while @int<8000
begin
insert into numbers(number) values(@int)
select @int=@int+1
end
GO
CREATE FUNCTION dbo.SplitString
(
@List TEXT,
@Delimiter CHAR(1)
)
RETURNS @ReturnTbl TABLE (OutParam VARCHAR(20))
WITH SCHEMABINDING
AS
--from http://www.sqljunkies.com/WebLog/amachanic/articles/SplitString.aspx
BEGIN
DECLARE @LeftSplit VARCHAR(7998)
DECLARE @SplitStart INT SET @SplitStart = 0
DECLARE @SplitEnd INT
SET @SplitEnd = 7998

SELECT @SplitEnd = MAX(Number)
FROM dbo.Numbers
WHERE (SUBSTRING(@List, Number, 1) = @Delimiter
OR Number = DATALENGTH(@List) + 1)
AND Number BETWEEN @SplitStart AND @SplitEnd

WHILE @SplitStart < DATALENGTH(@List) - 1
BEGIN
SET @LeftSplit = @Delimiter + SUBSTRING(@List, @SplitStart, @SplitEnd -
@SplitStart) + @Delimiter

INSERT @ReturnTbl (OutParam)
SELECT LTRIM(RTRIM(SUBSTRING(@LeftSplit, Number + 1,
CHARINDEX(@Delimiter, @LeftSplit, Number + 1) - Number
- 1)))
AS Value
FROM dbo.Numbers
WHERE Number <= LEN(@LeftSplit) - 1
AND SUBSTRING(@LeftSplit, Number, 1) = @Delimiter
AND SUBSTRING(@LeftSplit, Number + 1,
CHARINDEX(@Delimiter, @LeftSplit, Number + 1) - Number
- 1) <> ''

SET @SplitStart = @SplitEnd + 1
SET @SplitEnd = @SplitEnd + 7998

SELECT @SplitEnd = MAX(Number) + @SplitStart
FROM dbo.Numbers
WHERE (SUBSTRING(@List, Number + @SplitStart, 1) = @Delimiter
OR Number+@SplitStart = DATALENGTH(@List) + 1)
AND Number BETWEEN 1 AND @SplitEnd - @SplitStart
END

RETURN
END
GO
Create table Noise(noiseword varchar(20))
GO
insert into Noise(noiseword) values('about')
insert into Noise(noiseword) values('after')
insert into Noise(noiseword) values('all')
insert into Noise(noiseword) values('also')
insert into Noise(noiseword) values('an')
insert into Noise(noiseword) values('and')
insert into Noise(noiseword) values('another')
insert into Noise(noiseword) values('any')
insert into Noise(noiseword) values('are')
insert into Noise(noiseword) values('as')
insert into Noise(noiseword) values('at')
insert into Noise(noiseword) values('be')
insert into Noise(noiseword) values('because')
insert into Noise(noiseword) values('been')
insert into Noise(noiseword) values('before')
insert into Noise(noiseword) values('being')
insert into Noise(noiseword) values('between')
insert into Noise(noiseword) values('both')
insert into Noise(noiseword) values('but')
insert into Noise(noiseword) values('by')
insert into Noise(noiseword) values('came')
insert into Noise(noiseword) values('can')
insert into Noise(noiseword) values('come')
insert into Noise(noiseword) values('could')
insert into Noise(noiseword) values('did')
insert into Noise(noiseword) values('do')
insert into Noise(noiseword) values('does')
insert into Noise(noiseword) values('each')
insert into Noise(noiseword) values('else')
insert into Noise(noiseword) values('for')
insert into Noise(noiseword) values('from')
insert into Noise(noiseword) values('get')
insert into Noise(noiseword) values('got')
insert into Noise(noiseword) values('has')
insert into Noise(noiseword) values('had')
insert into Noise(noiseword) values('he')
insert into Noise(noiseword) values('have')
insert into Noise(noiseword) values('her')
insert into Noise(noiseword) values('here')
insert into Noise(nois