Count Words Function¶
Overview¶
In SQL Server, and I believe as well as in other databases, there's no built-in function that will return the number of words a given string contains. Assuming that the space character separates each word, counting the number of words in a string can be performed using the following query on the given definition of a Health Insurance:
DECLARE @String VARCHAR(4000)
SET @String = 'Health Insurance is an insurance againstexpenses incurred through illness of the insured.'
SELECT LEN(@String) - LEN(REPLACE(@String, ' ', '')) + 1
This query will return a value of 13, which is the number of words in the given string. But if the words are separate by not just a single space character but by multiple spaces, then this will return an incorrect result as can be seen from the following:
DECLARE @String VARCHAR(4000)
SET @String = 'Health Insurance'
SELECT LEN(@String) - LEN(REPLACE(@String, ' ', '')) + 1
There are three spaces between the words "Health" and "Insurance" and the value returned by the query is 4 instead of just 2. The reason why it returns a value of 4 is because the query simply counts the number of spaces in the string and adds 1 to it. So since there are 3 spaces between the words, the result becomes 4.
Function Definition¶
Here's a user-defined function that returns the number of words there are in the given input string.
CREATE FUNCTION [dbo].[WordCount] ( @InputString VARCHAR(4000) )
RETURNS INT
AS
BEGIN
DECLARE @Index INT
DECLARE @Char CHAR(1)
DECLARE @PrevChar CHAR(1)
DECLARE @WordCount INT
SET @Index = 1
SET @WordCount = 0
WHILE @Index <= LEN(@InputString)
BEGIN
SET @Char = SUBSTRING(@InputString, @Index, 1)
SET @PrevChar = CASE WHEN @Index = 1 THEN ' '
ELSE SUBSTRING(@InputString, @Index - 1, 1)
END
IF @PrevChar = ' ' AND @Char != ' '
SET @WordCount = @WordCount + 1
SET @Index = @Index + 1
END
RETURN @WordCount
END
GO
Description¶
What the user-defined function does is to go through each character in the input string. It increments the word count variable (@WordCount
) by 1 every time it encounters a non-blank character and the previous character is a space (IF @PrevChar = ' ' AND @Char != ' '
). So even if there are multiple spaces separating each word, the word will only be counted once.
Usage¶
Applying the [dbo].[WordCount]
user-defined function to the same sample string above, which is the definition of a "Health Insurance", the SQL query will look like as follows:
DECLARE @String VARCHAR(4000)
SET @String = 'Health Insurance is an insurance against expenses incurred through illness of the insured.'
SELECT [dbo].[WordCount] ( @String )
The output of the SELECT statement will be 13, the number of words in the string.