Write a common sorting algorithm - Database - Techguide

Write a common sorting algorithm

 

Summary

A sorting algorithm routine helps users to sort a set of data, typically stored in an array, a set or other data structure.

Events

Echelon 2012
June 11 and 12, 2012

University Cultural Centre, National University of Singapore

Startup Asia Jakarta 2012
June 7 and 8, 2012

12th Floor, Annex Building, Wisma Nusantara Complex, Jl. M.H. Thamrin No. 59 Jakarta 10350, Indonesia

MMA Forum Singapore
April 23-25, 2012

Grand Hyatt Singapore

You are probably familiar with the built-in sorting abilities of SQL Server for columnar data, but what if you need to sort character strings stored in SQL Server fields?

Even though this may not be a common need for most users, this issue may come up from time to time.

I'll demonstrate how to write a common sorting algorithm using SQL Server TSQL code.

Sorting algorithms
A sorting algorithm is a routine that sorts a set of data, typically stored in an array, a set or other data structure.

There are many types of sorting algorithms--some of which are very fast, sophisticated and complex, and some are pretty basic.

It's best to write the more complex algorithms in iterative languages, such as C++, where it is easy to define data structures to hold the data, and the constructs to sort are readily available.

You can write the more simple algorithms in TSQL, although some iteration is required.

In this example, I will implement a variation of the Bubble Sort sorting algorithm, which is one of the easiest sorting algorithms to implement because of its "brute force" nature. It loops through the characters, checking each one against the one next to it until it has determined that the strings (or numbers) are in sorted order.

Because of how it's implemented, Bubble Sort is one of the worst performing sorting algorithms, and it is only used for sorting small sets of data.

My example will contain small sets, so this algorithm will be fine, although I wouldn't run it on a field in a table with millions of records.

In "algorithm terms" Bubble Sort is O(n2), which means the maximum number of operations to sort the set is the number of items in the set squared. For my example, this number will be a little bit worse because of the manner in which I have to do some of the swapping in TSQL.

If this algorithm was implemented in C++, you could write it in such a way as to performing at O(n2).

The example
Before I get to the sorting function, I need to write a helper function to assist me with the sorting. This function will do the work of swapping two characters based on specific positions in the character string.

If I have the string called elephant, and I need to swap the second and fourth characters of the string, my swap function will produce epelhant. It may not make much sense to do this right now, but it is vital to the way in which our Bubble Sort algorithm works. (The example in this article applies to SQL Server 2000 and later.)

CREATE FUNCTION dbo.udf_Swap
 (
     @fullstring VARCHAR(1000),
     @charlocation1 TINYINT,
     @charlocation2 TINYINT
 )
 RETURNS VARCHAR(1000)
 AS
 BEGIN
         DECLARE @returnval varchar(1000)         

         DECLARE @begin VARCHAR(1000), @middle VARCHAR(1000), 
         @end VARCHAR(1000)
         DECLARE @firstchar CHAR(1), @secondchar CHAR(1), @len INT         

         SET @fullstring = LTRIM(RTRIM(@fullstring))
         SET @len = LEN(@fullstring)         

     IF @charlocation1 > @len OR @charlocation2 > @len
         SET @returnval = @fullstring
         ELSE
         BEGIN
                SET @firstchar = SUBSTRING(@fullstring, 
                @charlocation1, 1)
                SET @secondchar = SUBSTRING(@fullstring, 
                @charlocation2, 1)
                SET @begin = LEFT(@fullstring, 
                (@charlocation1-1))
                SET @middle = SUBSTRING(@fullstring, 
                @charlocation1+1, (@charlocation2-@charlocation1)-1)
                SET @end = SUBSTRING(@fullstring, 
                @charlocation2+1, @len)
                SET @returnval = @begin + @secondchar + @middle + 
                @firstchar + @end
         END         

     RETURN(@returnval)
 END

Now that I have my helper function, I can write the sorting function. This function, udf_SortString, accepts a string of characters and returns them in sorted order. If there are numbers, they will be returned first, followed by any alpha characters. If the string has one or more words, the words will be sorted individually.

There are two loops in this function that do the sorting work. The first loop's job is to loop until characters have been swapped. Once a set of characters have been swapped, it is time to start at the beginning and do another scan. This loop will end once no characters have been swapped, and the string is in sorted order.

The second loop examines each character in the string and compares it to the next character in the string. If the characters are in sorted order, it moves to the next character. If the characters are not sorted, it uses the function I created earlier and swaps them. Then the loop starts over, and the characters are examined again and again until no more swapping occurs. Once a pass is made and there is no swapping, the string is in sorted order.

CREATE FUNCTION udf_SortString
 (
     @string VARCHAR(1000)
 )
 RETURNS VARCHAR(1000)
 AS
 BEGIN
     DECLARE @len TINYINT
     DECLARE @i TINYINT
     DECLARE @currentchar CHAR(1)
     DECLARE @swapped BIT
     DECLARE @begin BIT
     DECLARE @nextchar CHAR(1)         

     SET @begin = 1
     SET @len = LEN(@string)
     SET @i = 1         

     WHILE @begin = 1 OR @swapped = 1
     BEGIN
         SET @swapped = 0
         SET @i = 1
         SET @begin = 0         

         WHILE @i <= @len
         BEGIN         

             SET @currentchar = SUBSTRING(@string, @i, 1)
             SET @nextchar = SUBSTRING(@string, @i + 1, 1)
             IF @currentchar > @nextchar AND (@nextchar > '')
             BEGIN
                 SET @string = dbo.udf_swap(@string, @i, @i + 1)
                 SET @swapped = 1
             END
             SET @i = @i + 1         

         END
     END
     RETURN(@string)
 END
 GO

All I need to do is call my function to sort a string:

SELECT dbo.udf_SortString('asdofimasdfasdf23rasdf')

Sort it out
Most database users don't think about how much work goes into sorting data and searching for data because the database provides the functionality. By understanding how this works, it gives me a greater appreciation of the functionality.

While you may not apply the information you learned in this example for most users, I wanted to show how you can write your own string sorting functions in TSQL if the need arises. Also, I wanted to provide an overview. Hope you learned about sorting algorithms if you weren't already aware of them.

Tim Chapman, an SQL Server database administrator and consultant who works for a bank, has more than eight years of IT experience, and he is a Microsoft certified Database Developer and Administrator. If you would like to contact him, please e-mail him at chapman.tim@gmail.com.

Talkback

Add your opinion

In order to post a comment, you need to be registered. (Sign In or register below)

Post your comment

ZDNet Asia Live

42 bands from 15 countries to feature at Music Matters Live 2012 which will beam live via YouTube for 1st time this year. #mm12

Music Matters to be launched in Bali via partnership w/Telkom Indonesia. #mm12

HP to shed 27K workers by 2014 http://t.co/OevueOGh http://t.co/erFSwAUB #arcavir

http://t.co/VNaUVSe1 HP to shed 27K workers by 2014: IT vendor plans exit of 8 percent of gl... http://t.co/5LKpdBSZ http://t.co/wiqTBKkj

China solar cell makers seek Taiwan partnershipshttp://bit.ly/JErUGz via @zdnetasia #solar #energy #china

Malaysia organizations don't realize severity of cyberattacks http://t.co/PUCv68Rd

News: Radio Costa Rica by EnjoyIT 1.0: Radio Costa Rica allows you to listen to a great var... http://t.co/BLzVT5As http://t.co/1Dhcy6ki

The key for mobile operators is identifying the applications that are popular with subscribers on their network. They can then work partn...

3 hours ago by camcullen on Experience trumps content in apps monetization

Experience trumps content in apps monetization | ZDNet http://t.co/gBXcjbGd

Experience trumps content in apps monetization - ZDNet Asia News: "What we are doing currently is not to monetiz... http://t.co/S2EZtd8m

Malaysia organizations don't realize severity of cyberattacks: "Minister Maximus Johnity Ongkili said at the Sec... http://t.co/bgVlOBvx

#security Malaysia organizations don't realize severity of cyberattacks: "Minister Maximus Johnity Ongkili said ... http://t.co/hkFb4zrI

Malaysia organizations don't realize severity of cyberattacks http://t.co/EEEmRM3j via @zdnetasia

Malaysia organizations don't realize severity of cyberattacks - ZDNet Asia News http://t.co/YpNMYgb5

Malaysia organizations don't realize severity of cyberattacks http://t.co/FFems54Q

China solar cell makers seek Taiwan partnerships http://t.co/p5Hh7kJD

So much as we know , MTK6575 extremely integrated frequency1GHz ARM Cortex-A9 processor, the superiority of 3G / HSPA Modem, and help the...

1 day ago by y15822137359 on 5 SaaS adoption speed bumps to avoid

I reckon your view: "CRM is strategy, not software", if a company replicating the approach uses in ERP implementation into CRM, what they...

3 days ago by wykoong on Gartner: Mobile CRM gives better ROI than social

This video will teach you about the Excel fill handle but also provide you with a workook to download... http://www.youtube.com/watch?v=...

3 days ago by TradeBrother on A quick fill handle trick for Microsoft Excel

waiting...

5 days ago by eapete on What should count in a company's market value?

Boy, you've opened a can of worms now.

Wait for the rants & raves.

5 days ago by eapete on What should count in a company's market value?

I was puzzling before this whether to replicate the success formula we executed for a financial institute, and come out with a standard s...

6 days ago by wykoong on Drop the egos, copy ideas, then innovate