Pogo69's Blog

July 12, 2012

SQL Server (2005+) Ranking Functions – A Better Mailout Boxing Algorithm

Filed under: Cutting Code, SQL Server, Transact SQL — pogo69 [Pat Janes] @ 15:41

Introduction

I’d been wanting to post about the joys of SQL Server Transact SQL’s (new to SQL Server 2005) ranking functions since learning of them.  I wrote this article originally as an explanation to my fellow development team members about what I had done and why I did it.

It then morphed into a blog/article posting that never found a home.  It was seemingly lost forever, until this morning when I resurrected a copy from a hard drive that had been sitting in a long-dead notebook.

Overview

As a developer, there is so much to love about SQL Server 2005.

In my current role, our primary responsibility, an Intranet, is largely AJAX based, so we’ve taken great pleasure in converting such things as the formerly clunky FOR XML EXPLICIT syntax to the far more elegant FOR XML PATH syntax, allowing us to generate XML documents in a largely “native” manner.

That specific need aside, the new features that have afforded us the greatest improvements in code clarity and performance are the SQL Server ranking functions.

This article will describe a recently encountered problem and how it was solved using those ranking functions to improve:

  • the performance of the code;
  • the readability of the code; and
  • the maintainability of the code.

Boxing Algorithms for Mailouts

In about three weeks’ time, I’ll be starting a new job; as such, I am, in my current role, starting to wind down. This means that the majority of my time has been allocated to finishing up current projects, documenting what I know and transferring knowledge to other team members.

In between times, I’ve been devoting my “spare” time to those areas of the system that I’ve “always wanted to improve, but never had the time”.

A large sub-system of our Intranet is devoted to the selection of “contact” data and the subsequent mailout of invitations to those contacts. A couple of years ago, the company expanded its operations into New Zealand; this brought with it some unique issues, not least of which were some very specific requirements from NZ Post with respect to the “boxing” of bulk mail in order to receive bulk discounts.

In New Zealand, postcodes are allocated in groups to “Lodgement Centres”. The mail for each lodgement centre must be “boxed” into groups of 300, sorted by descending postcode.

Ever since we upgraded our system to SQL Server 2005 from its former SQL Server 2000 platform, and I was introduced to T-SQL’s new ranking functions, I could see that this problem was crying out to be solved using them; but I never had the time to re-write the existing code. Yesterday, I finally got around to doing so.

NB: You’ll see the use of the new XML datatype in both the old and new version of the code; this is due to that portion of the code already having been converted post SQL Server 2005 upgrade.

The Old Solution (Cursors – ewww)

The former solution uses a cursor to iterate through the contact/invitation records, such that we can deal with them per lodgement centre:

        -- @xmlSeminars is actually passed into a stored procedure, but is explicitly declared for
        -- clarity in this example
        DECLARE @xmlSeminars XML;
        SELECT @xmlSeminars = '<seminars><seminar id="123" /><seminar id="456" /></seminars>';

        CREATE TABLE #invite
        (
            intSeminarContactID INT,
            intLodgementCentreID INT,
            strPostCode VARCHAR(6)
        );

        CREATE TABLE #lodgement_centre
        (
            intOrder INT IDENTITY(0, 1),
            intSeminarContactID INT,
            strPostCode VARCHAR(6)
        );

        CREATE TABLE #invite_lodgement_centre
        (
            intOrder INT,
            intSeminarContactID INT,
            intLodgementCentreID INT,
            strPostCode VARCHAR(6),
            intBox INT
        );

        INSERT
            #invite
        SELECT
            sc.intSeminarContactID,
            lcb.intLodgementCentreID,
            a.strPostCode
        FROM
            tblSeminarContact sc
        JOIN
            tblContact c ON c.intContactID = sc.intContactID
        JOIN
            tblAddress a ON a.intAddressID = c.intAddressID
        JOIN
            tblLodgementCentreBand lcb ON a.strPostCode BETWEEN lcb.strPostCode_From AND lcb.strPostCode_To
        JOIN
            @xmlSeminars.nodes('seminars/seminar') AS seminars(seminar) ON seminars.seminar.value('@id', 'INT') = sc.intSeminarID;

        DECLARE @intLodgementCentreID INT;

        DECLARE curLodgementCentre CURSOR FOR
            SELECT DISTINCT
                #invite.intLodgementCentreID
            FROM
                #invite;

        OPEN curLodgementCentre;

        FETCH NEXT FROM curLodgementCentre
        INTO @intLodgementCentreID;

        WHILE @@FETCH_STATUS = 0
        BEGIN
            INSERT
                #lodgement_centre (intSeminarContactID, strPostCode)
            SELECT
                #invite.intSeminarContactID,
                #invite.strPostCode
            FROM
                #invite
            WHERE
                #invite.intLodgementCentreID = @intLodgementCentreID
            ORDER BY
                #invite.strPostCode;

            INSERT
                #invite_lodgement_centre
            SELECT
                #lodgement_centre.intOrder,
                #lodgement_centre.intSeminarContactID,
                @intLodgementCentreID,
                #lodgement_centre.strPostCode,
                #lodgement_centre.intOrder / 300
            FROM
                #lodgement_centre;

            -- reset the working table (and its IDENTITY seed) ready for the next lodgement centre
            TRUNCATE TABLE #lodgement_centre;

            FETCH NEXT FROM curLodgementCentre
            INTO @intLodgementCentreID;
        END

        CLOSE curLodgementCentre;
        DEALLOCATE curLodgementCentre;

This causes the following problems:

  • it hampers performance, due to:
    • using programmatic iterative operations instead of set based operations, which is what SQL Server is good at
    • the overhead of cursors
  • it obfuscates the intention of the boxing algorithm

The New Solution (Ranking Functions – ROW_NUMBER and DENSE_RANK)

The new code solves all of these problems in two simple, elegant, set-based operations.

1. Ordering and Partitioning

        -- @xmlSeminars is actually passed into a stored procedure, but is explicitly declared for clarity in this example
        DECLARE @xmlSeminars XML;
        SELECT @xmlSeminars = '<seminars><seminar id="123" /><seminar id="456" /></seminars>';

        CREATE TABLE #invite
        (
            intSeminarContactID INT,
            intLodgementCentreID INT,
            strPostCode VARCHAR(6),
            intOrder INT
        );

        CREATE TABLE #invite_lodgement_centre
        (
            intSeminarContactID INT,
            intLodgementCentreID INT,
            strPostCode VARCHAR(6),
            intOrder INT,
            intBox INT
        );

        INSERT
            #invite
        SELECT
            sc.intSeminarContactID,
            lcb.intLodgementCentreID,
            a.strPostCode,
            (ROW_NUMBER() OVER (PARTITION BY lcb.intLodgementCentreID ORDER BY a.strPostCode DESC)) - 1      -- (1)
        FROM
            tblSeminarContact sc
        JOIN
            tblContact c ON c.intContactID = sc.intContactID
        JOIN
            tblAddress a ON a.intAddressID = c.intAddressID
        JOIN
            tblLodgementCentreBand lcb ON a.strPostCode BETWEEN lcb.strPostCode_From AND lcb.strPostCode_To
        JOIN
            @xmlSeminars.nodes('seminars/seminar') AS seminars(seminar) ON seminars.seminar.value('@id', 'INT') = sc.intSeminarID;

(1)
This initial INSERT statement makes use of the new ROW_NUMBER() ranking function that allows us to generate an ordinal number (ORDER BY) for each record in the set, optionally partitioning those ordinal numbers into groups (PARTITION BY); the results generated look something like the following:

intSeminarContactID  intLodgementCentreID  strPostCode  intOrder
123                  1                     9999         0
234                  1                     9999         1
345                  1                     9998         2
…elided…
456                  1                     9996         299
567                  1                     9996         300
678                  1                     9996         301
789                  1                     9995         302
4234                 2                     6666         0
1234                 2                     6666         1
1235                 2                     6666         2
1236                 2                     6665         3
1237                 2                     6665         4
1238                 2                     6665         5
1239                 2                     6664         6
1211                 2                     6664         7
…elided…

NB: I decrement the ordinal number returned by ROW_NUMBER() by one in (1) above, to allow a simple integer division calculation in the following statement:

2. Boxing

        INSERT
            #invite_lodgement_centre
        SELECT
            #invite.intSeminarContactID,
            #invite.intLodgementCentreID,
            #invite.strPostCode,
            #invite.intOrder,
            DENSE_RANK() OVER (ORDER BY #invite.intLodgementCentreID, #invite.intOrder / 300)               -- (2)
        FROM
            #invite;

(2)
The #invite.intOrder / 300 expression within statement (2) divides our invitations into boxes of 300.

Wrapping that expression in DENSE_RANK() “corrects” the boxing such that each lodgement centre has a distinct set of boxes; this works because DENSE_RANK() ranks the dataset according to the content of the ORDER BY clause, in this case the lodgement centre followed by the raw box number. We use DENSE_RANK() instead of RANK() so that the intBox box identifiers are kept contiguous.

So now, our new dataset has each invitation record, allocated to a “box” of 300 or fewer, where each box contains records from only a single lodgement centre.

intSeminarContactID  intLodgementCentreID  strPostCode  intOrder  intBox
123                  1                     9999         0         1
234                  1                     9999         1         1
345                  1                     9998         2         1
…elided…
456                  1                     9996         299       1
567                  1                     9996         300       2
678                  1                     9996         301       2
789                  1                     9995         302       2
4234                 2                     6666         0         3
1234                 2                     6666         1         3
1235                 2                     6666         2         3
1236                 2                     6665         3         3
1237                 2                     6665         4         3
1238                 2                     6665         5         3
1239                 2                     6664         6         3
1211                 2                     6664         7         3
…elided…
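
If you would like to see the DENSE_RANK() vs RANK() distinction in isolation, the following self-contained sketch can be pasted straight into Management Studio; the table variable and a box size of 3 (rather than 300) are purely illustrative:

        -- purely illustrative: a tiny stand-in for #invite, boxing into groups of 3 rather than 300
        DECLARE @invite TABLE (intLodgementCentreID INT, intOrder INT);

        INSERT @invite VALUES (1, 0);
        INSERT @invite VALUES (1, 1);
        INSERT @invite VALUES (1, 2);
        INSERT @invite VALUES (1, 3);
        INSERT @invite VALUES (1, 4);
        INSERT @invite VALUES (2, 0);
        INSERT @invite VALUES (2, 1);

        SELECT
            intLodgementCentreID,
            intOrder,
            DENSE_RANK() OVER (ORDER BY intLodgementCentreID, intOrder / 3) AS intBox_dense,    -- 1, 1, 1, 2, 2, 3, 3
            RANK() OVER (ORDER BY intLodgementCentreID, intOrder / 3) AS intBox_rank            -- 1, 1, 1, 4, 4, 6, 6
        FROM
            @invite;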

Conclusion

As you can see, the ranking functions clean up the code quite considerably; in addition to being far simpler, the code is now far easier to maintain, with the original intention of the code no longer obfuscated behind a wall of crazy cursor logic.

A DBMS will always be far happier working with set based operations; the introduction of ranking functions in SQL Server 2005 is one more way in which we can keep SQL Server, and those of us who work with it every day, happy.

March 28, 2012

CRM SSRS Reports, SQL CLR and External Data – A Case Study

Filed under: .NET, C#, CRM, Cutting Code, SQL Server, Transact SQL — pogo69 [Pat Janes] @ 14:21

I met some interesting requirements recently, wherein I needed to produce a CRM Report that joined standard CRM Data using the FilteredEntityName views with data provided by a 3rd party dll.

Given the complexities of the resultant solution, I thought it would make for an interesting Case Study.  Hopefully, others can learn something from my experiences and perhaps broaden their minds to the possibilities of CRM Integration.

Architectural Overview

A picture speaks a thousand words, so I’ll shut up for a moment.

In summary:

  1. My SSRS Report (uploaded to CRM) renders data in a single table provided by;
  2. An SQL query that joins data retrieved from;
  3. A CRM Database FilteredEntity View and;
  4. A Table Valued Function wrapping an;
  5. SQL-CLR Table Valued Function that retrieves the required data from;
  6. A Web Service that calculates Fatigue ratings using;
  7. A 3rd-party 32-bit DLL

Now, if that seems a little bit convoluted, keep reading for the reasoning behind this spider-web of data manipulation and retrieval.

I will start backwards, as the final piece of the puzzle is actually the beginning of the story.

How (and Why) It All Works

32-bit DLLs and CRM 2011

In contrast to CRM 4.0 that preceded it, CRM 2011 is now available only as a 64-bit installation.  This means that if you ever have a need to access 32-bit DLLs (legacy code, 3rd-party functionality, pretty much every ODBC driver ever written), you are unable to do so unless you write a “thunking” layer, providing access to the 32-bit functionality in a separate memory space in order to maintain compatibility.

I chose to wrap my 32-bit 3rd-party DLL in a WCF Service.  I have exposed several endpoints, including a basicHttpBinding that allows interoperability with older SOAP based clients – this is necessary for interaction from an SQL-CLR object as the requisite assemblies are not available to create a more modern WCF client.

As touched upon briefly above, I have implemented the same sort of idea to interoperate with ODBC drivers for legacy database applications that provide no other interface.

P/Invoke – The Murky World of Managed/Unmanaged Interoperability

I will just briefly mention a tool called the ‘P/Invoke Interop Assistant’ that I found invaluable when building a C# wrapper for the above-mentioned 32-bit DLL.  The DLL was written in C and exposes only a legacy C-based interface that cannot easily be consumed by managed code, which is where P/Invoke comes in.

By adding your legacy 32-bit DLL to your managed project and marking it as Content, it will be copied to your output folder.  You must then create DllImport wrappers for any functions that you wish to access, e.g. (names changed to protect the innocent):

[DllImport("xxxx.DLL", EntryPoint="_xxx_yyyy@nn")]
 internal static extern int xxx_yyyy(
 int p1,
 ref int p2,
 int p3,
 ref int p3,
 ref int p4,
 ref int p5,
 ref double p6,
 ref double p7,
 int p8,
 ref int p9,
 ref double p10
);

I won’t go into the nightmare that was trying to figure out exactly what that signature should look like, nor the corresponding nightmare that was figuring out how to properly talk to it.  However, I can say that the P/Invoke Interop Assistant made life somewhat easier.

SQL-CLR Table-Valued Functions are awesome!

I got so excited about SQL-CLR when we first came in contact with each other.  In fact, I was all over a whole stack of new “toys” that became available with SQL Server 2005 (I will mention another, Common Table Expressions or CTEs a little later).

SQL-CLR allows you to write a managed assembly (.NET code) that can perform tasks not ordinarily available via Transact SQL and access that functionality via a Stored Procedure or Function wrapper.  My first attempt at such an assembly was to expose some simple .NET Regular Expression pattern matching/replacing functionality, which became invaluable (and surprisingly fast) for a data cleansing project on which I was working.

In this case, however, I wrote my first SQL-CLR Table Valued function, which does exactly what it says – returns data from a CLR managed assembly that is represented and consumed in Transact SQL as a Table.

So, what does an SQL-CLR Table Valued function look like?  I’m glad you asked.  I will let MSDN explain in detail, but basically, you have to implement two methods:

  1. One method does the actual data retrieval/collection/however you obtain the data – this method must return the interface System.Collections.IEnumerable (or something that implements same) and will be marked with metadata describing various SQL related properties and the name of the method that will be used to “feed” the data back to SQL Server
  2. The second method does the feeding – taking a generic Object (each being an individual element of the collection returned from the first method), ripping it apart, and populating out parameters for each column in the resultset.  This allows SQL Server to stream the data back to the query that calls your assembly
Example as follows:
using System;
using System.Data;
using System.Data.SqlClient;
using System.Data.SqlTypes;
using System.Collections.Generic;
using System.Linq;
using System.Xml;
using Microsoft.SqlServer.Server;
using BMP.FAID.SqlClr.WebServiceWrapper;

public partial class CalculateFatigue
{
    [SqlFunction(FillRowMethodName = "FillRow", DataAccess = DataAccessKind.None, TableDefinition = "Start datetime, Fatigue float", IsDeterministic = true, IsPrecise = false)]
    public static System.Collections.IEnumerable InitMethod(SqlXml rosterPeriods, string faidServiceUrl)
    {
        return WebService.Process(rosterPeriods.Value, faidServiceUrl);
    }

    public static void FillRow(Object obj, out SqlDateTime start, out SqlDouble fatigue)
    {
        HourlyFatigue hourBlock = (HourlyFatigue)obj;
        start = hourBlock.Start;
        fatigue = hourBlock.Fatigue;
    }
}

NB: I implemented my Web Service Wrapper in another assembly so that I could unit test it outside of SQL Server.

As with the in-depth explanation of SQL-CLR Table Valued functions, I will leave the details of the deployment of these managed assemblies to your MSDN searching capabilities – in fact, I think I may have gone through the process in another blog posting.
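
That said, the basic shape of the deployment is worth sketching here.  The assembly name and file path below are assumptions on my part; only the CalculateFatigue class, the InitMethod name, the BMP_Functions database and the table definition come from the code above and the query that follows:

-- sketch only: the assembly name and path are placeholders, not the production script
USE BMP_Functions;
GO

CREATE ASSEMBLY [BMP.FAID.SqlClr]
FROM '<path to build output>\BMP.FAID.SqlClr.dll'
WITH PERMISSION_SET = EXTERNAL_ACCESS;   -- external access is required to call the web service
GO

-- Transact SQL wrapper for the CLR Table Valued function
CREATE FUNCTION dbo.InitMethod
(
    @rosterPeriods XML,
    @faidServiceUrl NVARCHAR(255)
)
RETURNS TABLE ([Start] DATETIME, Fatigue FLOAT)
AS EXTERNAL NAME [BMP.FAID.SqlClr].[CalculateFatigue].[InitMethod];
GO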

The Transact SQL Query

Apologies for the verbosity, but there are several elements of the query that I would like to explain for those who are not familiar with the concepts within.

--declarations and setup (hard-coded for demonstrative purposes)
declare
 @pilotId uniqueidentifier = 'A4D1BAF3-BC41-E111-B9ED-00012E244939',
 @year int = 2012,
 @month int = 2;

declare
 @monthBegin datetime = (select cast(@year as nvarchar) + '-' + cast(@month as nvarchar) + '-01');
declare
 @monthEnd datetime = DATEADD(SECOND,-1,DATEADD(Month,1,@monthBegin));

declare
 @daysInMonth int = (select DAY(@monthEnd));

declare
 @faidServiceUrl nvarchar(255) = (select top 1 mrcrm_faidserviceurl from Filteredmrcrm_rosterandavailabilitysettings);

create table #HourBlock
(
 [Start] [datetime] NULL,
 [Fatigue] [float] NULL
);

-- find roster items
declare
 @rosterPeriods xml =
 (
  select
   rsi.mrcrm_Start, rsi.mrcrm_End, rt.mrcrm_RiskLevel
  from
   mrcrm_rosterscheduleitem rsi
  join
   mrcrm_rostertype rt on rt.mrcrm_rostertypeId = rsi.mrcrm_RosterTypeId
  where
   rsi.mrcrm_EmployeeId = @pilotId
  and
   rt.mrcrm_RiskLevel > 0
  and
   rsi.mrcrm_Start <= @monthEnd
  and
   rsi.mrcrm_End >= DATEADD(DAY, -14, @monthBegin)
  and
   rsi.statecode = 0
  order by
   rsi.mrcrm_Start
  for xml path('period'), root('roster')
 );

-- use faid service to calculate hourly fatigue
insert
 #HourBlock
select
 Start, Fatigue
from
 [BMP_Functions].[dbo].[InitMethod](@rosterPeriods, @faidServiceUrl);

;with
-- obtain list of all hours in the day
Hour_CTE (Hour)
as
(
 select 0

 union all

 select Hour + 1 from Hour_CTE where Hour < 23
),
-- obtain list of all days in the month
Day_CTE (Day)
as
(
 select @monthBegin

 union all

 select DATEADD(Day, 1, Day) from Day_CTE where Day < @monthEnd
)
-- join it all up together
select
 Day_CTE.Day,
 Hour_CTE.Hour,
 ISNULL(h.Fatigue, 0.0) as Fatigue
from
 Hour_CTE
cross join
 Day_CTE
left join
 #HourBlock h on DATEPART(Hour, h.Start) = Hour_CTE.Hour and Day(h.Start) = Day(Day_CTE.Day);

Again, apologies for the verbosity, but this is as simple as I could make the query while still demonstrating what I had hoped to (yes, the production version is more complicated than this).  Now, to break it down.

Configuration Data

I store system-wide configuration data in a “Settings” entity instance in CRM.  This allows me to set variables such as Web Service URLs centrally and access them from anywhere that has access to CRM Data.  CRM components (plugins etc.) access the Fatigue Web Service shown here in exactly the same way.

declare
 @faidServiceUrl nvarchar(255) = (select top 1 mrcrm_faidserviceurl from Filteredmrcrm_rosterandavailabilitysettings);

Table Inputs into SQL-CLR Functions (are not allowed) – ‘FOR XML PATH’

In order to calculate Fatigue ratings, the 3rd-party component I was using requires a “table” of input data.  Unfortunately, although you can output a table of data from an SQL-CLR function, you cannot input such data.  So I decided to convert my input data into XML and feed it into my managed assembly function via an SQL XML datatype.

This is another of the “new for SQL Server 2005” Transact SQL enhancements that I was very excited about upon first use.  While the “FOR XML …” syntax was available prior to SQL Server 2005, the introduction of “FOR XML PATH…” made the structured formatting of XML data so much cleaner.

-- find roster items
declare
 @rosterPeriods xml =
 (
  select
   rsi.mrcrm_Start, rsi.mrcrm_End, rt.mrcrm_RiskLevel
  from
   mrcrm_rosterscheduleitem rsi
  join
   mrcrm_rostertype rt on rt.mrcrm_rostertypeId = rsi.mrcrm_RosterTypeId
  where
   rsi.mrcrm_EmployeeId = @pilotId
  and
   rt.mrcrm_RiskLevel > 0
  and
   rsi.mrcrm_Start <= @monthEnd
  and
   rsi.mrcrm_End >= DATEADD(DAY, -14, @monthBegin)
  and
   rsi.statecode = 0
  order by
   rsi.mrcrm_Start
  for xml path('period'), root('roster')
 );

This query outputs data in the following format:

<roster>
 <period>
  <mrcrm_Start>2012-02-06T12:00:00</mrcrm_Start>
  <mrcrm_End>2012-02-06T13:30:00</mrcrm_End>
  <mrcrm_RiskLevel>3</mrcrm_RiskLevel>
 </period>
</roster>

Such data can then be processed in C# using the usual System.Xml provided mechanisms.

Using the SQL-CLR Table Valued Function

Using the SQL-CLR Table Valued Function is as simple as referencing its “wrapper” user-defined function, passing in any required parameters (in this case, the XML data and web service URL) and selecting from the resultant table.

-- use faid service to calculate hourly fatigue
insert
 #HourBlock
select
 Start, Fatigue
from
 [BMP_Functions].[dbo].[InitMethod](@rosterPeriods, @faidServiceUrl);

Common Table Expressions or CTEs – Outputting the Data

Also newly introduced in SQL Server 2005 is the concept of Common Table Expressions.  CTEs provide the equivalent of an inline function, which can be very handy when building reports for Dynamics CRM, given that user-defined elements such as custom stored procedures and user-defined functions are disallowed within the CRM database.

NB: in the case of my SQL-CLR function, I stored the managed assemblies and function in a separate database.

In addition to providing a mechanism for an “inline” function, CTEs allow the definition of a recursive function as per my Hour_CTE and Day_CTE below; in this case, I am building a list of:

  1. Each Hour of the Day (0-23)
  2. Each Day in the Month (1-however many days are in the current month)
The results of the CTEs and the SQL-CLR function are then combined and output via a single select statement – a CROSS JOIN is used to ensure that a value is produced for each combination of Day and Hour.

;with
-- obtain list of all hours in the day
Hour_CTE (Hour)
as
(
 select 0

 union all

 select Hour + 1 from Hour_CTE where Hour < 23
),
-- obtain list of all days in the month
Day_CTE (Day)
as
(
 select @monthBegin

 union all

 select DATEADD(Day, 1, Day) from Day_CTE where Day < @monthEnd
)
-- join it all up together
select
 Day_CTE.Day,
 Hour_CTE.Hour,
 ISNULL(h.Fatigue, 0.0) as Fatigue
from
 Hour_CTE
cross join
 Day_CTE
left join
 #HourBlock h on DATEPART(Hour, h.Start) = Hour_CTE.Hour and Day(h.Start) = Day(Day_CTE.Day);

SSRS and SQL-CLR Functions – A Difficulty

One final word on my experiences with this process.

There is a well-known issue with SSRS Reports, wherein the rendering of large quantities of data can cause excess memory consumption and the resultant failure of a Report to complete.  I encountered this issue; however, the supposed cause of such issues did not apply to me – 25 columns and 31 rows of data (in the Production Report) does not constitute a large quantity of data.  Nor does the single, lone table rendering that data constitute a complex visual display.

While I cannot prove causality, I was able to overcome the memory consumption issue by moving my table and query defined above into a Sub-Report.

My hypothesis is that there is something in my query that cannot cope with another DataSet being defined in the same report (I was using other DataSets to provide data for the Year, Month and Pilot parameters).  This hypothesis is supported by the fact that removing the additional DataSets and hard-coding the parameters allowed the Report to render successfully.

I asked for assistance in the SSRS MSDN Support Forums, but was unable to elicit any useful information.  When time permits, I will follow up with this new discovery (I have only just stumbled upon the Sub-Report workaround) and see if it sparks any further ideas.

 

May 10, 2011

CRM Web Services from SQL Server Transact SQL (SQLCLR)

Filed under: C#, CRM, SQL Server, Transact SQL — pogo69 [Pat Janes] @ 17:53

Let me preface this post by saying “bugga!!”.  Sometime last year, I assisted one of my fellow CRM Community Forum members through the process of building a SQL CLR assembly that communicates with CRM via web services.  This mechanism allows us to synchronise SQL Server updates with CRM in a supported manner by calling our assembly via Create/Update triggers in our 3rd party SQL Server database.  I created the shell for this post quite some time ago, but never got around to starting it, let alone finishing.

Now that CRM 2011 has been released, I was hoping to update the process to describe the steps necessary to create a SQLCLR assembly that communicates with the new CRM 2011 OrganizationService.  Unfortunately, this is not possible, as SQL Server 2005 and SQL Server 2008 (up to and including R2) do not support the .NET Framework v4.0.

So… back to the CRM 4.0 SOAP endpoint it is!!

Creating the SQLCLR Assembly

The example I’m going to describe will allow the creation of a new Contact record from within an SQLCLR Stored Procedure.

New Visual Studio Project

In Visual Studio:

  1. Click the ‘File’ menu
  2. Click ‘New’
  3. Click ‘Project…’
  4. In the Project window, select the ‘Visual C# SQL CLR Database Project’ in the Database / SQL Server category:

In the ‘New Database Reference’ dialog, choose or enter the name of the target SQL Server instance; then select the target database:

Choose ‘No’ for SQL/CLR debugging (or ‘Yes’ if you wish, but I’m not going to go into that here):

New Stored Procedure class

Next, we create the shell of our new SQLCLR Stored Procedure.

  1. Right-click on the Project
  2. Click ‘Add’
  3. Click ‘Stored Procedure…’

In the ‘Add New Item’ dialog, enter the name of your SQLCLR Stored Procedure:

Your newly created Stored Procedure class will look like the following:

using System;
using System.Data;
using System.Data.SqlClient;
using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;

public partial class StoredProcedures
{
	[Microsoft.SqlServer.Server.SqlProcedure]
	public static void CreateContact()
	{
		// Put your code here
	}
};

CRM Web Reference

Now, we need to add a Web Reference to the CRM 4.0 SOAP endpoint.  It is via this interface that we will communicate with the CRM.

  1. Right-click on the ‘References’ folder
  2. Click ‘Add Service Reference…’

Click on ‘Advanced…’:

Click on ‘Add Web Reference…’:

In the ‘Add Web Reference’ dialog, enter the fully qualified URL path to your CRM Service; then click on Green arrow icon next to the URL:

The CRM Web Service URL takes the following format:

http://<server>:<port>/MSCRMServices/2007/CrmService.asmx

Visual Studio will send a request to the CRM Service, retrieve the WSDL and build a Web Service proxy, which you must now name in the ‘Web reference name’ field:

Click on ‘Add Reference’ and you will see your newly added Web Reference in the Project’s ‘Web References’ folder:

Using the CRM Service to Create a Contact

Stored Procedure Signature

Next, we’ll add some parameters that can be used to create our new Contact.  Our contacts will each have:

  • First Name
  • Last Name
  • Email Address
public static void CreateContact(SqlString firstName, SqlString lastName, SqlString email)
{
	// Put your code here
}

Each of the SQL Server datatypes is represented by a SQLCLR “wrapper” object that encapsulates the underlying .NET object.  This sample uses only SqlString, which encapsulates the .NET “string” datatype and represents a SQL Server NVARCHAR.

CrmSdk Namespace

I’ve added a “using” directive, so that we can use the CRM Web Service objects and methods directly:

using System;
using System.Data;
using System.Data.SqlClient;
using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;

using SqlServerProject1.CrmSdk;

public partial class StoredProcedures
{
	[Microsoft.SqlServer.Server.SqlProcedure]
	public static void CreateContact(SqlString firstName, SqlString lastName, SqlString email)
	{
	}
};

CRM Web Service Connection and Interactivity

At this point, we can write CRM Web Service code just like we always have in our plugins, 3rd party applications and whatever we decide to implement.

  1. Obtain an Authentication Token
  2. Create a Web Service connection
  3. “Do stuff”
using System;
using System.Data;
using System.Data.SqlClient;
using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;

using SqlServerProject1.CrmSdk;

public partial class StoredProcedures
{
	[Microsoft.SqlServer.Server.SqlProcedure]
	public static void CreateContact(SqlString firstName, SqlString lastName, SqlString email)
	{
		CrmAuthenticationToken token = new CrmAuthenticationToken();
		token.AuthenticationType = 0;
		token.OrganizationName = "<uniqueorgname>";

		using (CrmService crmService = new CrmService())
		{
			crmService.Url = "http://<server>:<port>/MSCRMServices/2007/CrmService.asmx";
			crmService.Credentials = new System.Net.NetworkCredential("<username>", "<password>", "<domain>");
			crmService.CrmAuthenticationTokenValue = token;
			crmService.UnsafeAuthenticatedConnectionSharing = true;
			crmService.PreAuthenticate = true;

			contact contact = new contact();
			contact.firstname = firstName.Value;
			contact.lastname = lastName.Value;
			contact.emailaddress1 = email.Value;

			crmService.Create(contact);
		}
	}
};

Side Issues

Exception Handling

The code above has no exception handling.  Your production code should.  But I’m leaving that as an exercise for the reader, as it is largely an application-dependent architectural issue.
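
That said, on the Transact SQL side at least, the call to the wrapper procedure (created during Deployment below) can be wrapped defensively; a minimal sketch:

BEGIN TRY
	EXEC [dbo].[CreateContact]
		@firstName = N'Testy',
		@lastName = N'Tester',
		@email = N'testiest@test.com';
END TRY
BEGIN CATCH
	-- any exception thrown inside the SQLCLR assembly surfaces here as a regular T-SQL error
	SELECT ERROR_NUMBER() AS ErrorNumber, ERROR_MESSAGE() AS ErrorMessage;
END CATCH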

Web Service Connection Parameters

The Web Service connection parameters (credentials, URL, Organisation Name) etc. are hard-coded.  Again, configurable alternatives are left as an exercise for the reader, although I may come back to this issue at a later time.  For now, some options to consider include:

  • Passing in Connection data as additional parameters
  • Storing Connection data in the Database (see the sketch below)
  • Storing Connection data in the Registry or File System
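
To illustrate the second of those options, the connection data could live in a small settings table that is read before calling the SQLCLR procedure.  This is only a sketch; the table and column names are hypothetical and not part of the CRM schema:

-- hypothetical settings table; not part of the CRM schema
CREATE TABLE dbo.CrmConnectionSettings
(
	SettingName NVARCHAR(50) NOT NULL PRIMARY KEY,
	SettingValue NVARCHAR(255) NOT NULL
);

INSERT dbo.CrmConnectionSettings (SettingName, SettingValue) VALUES (N'CrmServiceUrl', N'http://<server>:<port>/MSCRMServices/2007/CrmService.asmx');
INSERT dbo.CrmConnectionSettings (SettingName, SettingValue) VALUES (N'OrganizationName', N'<uniqueorgname>');

-- the calling code (or an extended CreateContact signature) can then look the values up at runtime
DECLARE @crmUrl NVARCHAR(255);
SELECT @crmUrl = SettingValue FROM dbo.CrmConnectionSettings WHERE SettingName = N'CrmServiceUrl';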

XML Serialisation Assembly

One of the restrictions of SQLCLR is that you cannot make use of any .NET Assembly that has not explicitly been loaded into the target SQL Server database, other than a very restricted set of .NET Framework assemblies that have been deemed “safe”.

A side-effect of this is that an assembly such as ours that utilises Web Services will not work unless we manually generate an XML Serialisation Assembly.

The .NET runtime uses a Serialisation assembly to package Web Service requests into XML packets for transfer via HTTP to the Web Service host.  Ordinarily, serialisation assemblies are generated “on the fly” by the Framework, however we cannot make use of this mechanism as there is no way to dynamically load the resultant assembly into SQL Server.

To generate a Serialisation assembly from within Visual Studio:

  1. Right-click on the Project
  2. Click ‘Properties’
  3. Navigate to the ‘Build’ tab
  4. Change ‘Generate serialization assembly’ to ‘On’

Using sgen.exe from the Command Line

If you experience any issues with the generation of serialisation assemblies from within Visual Studio, you can use sgen.exe directly from the command line.  I do this for all of my serialisation code because:

  • There is no way to specify sgen.exe parameters from within the Visual Studio GUI
  • It takes a ridiculously long time for sgen.exe to complete using whatever parameters Visual Studio chooses to throw at it

I create a post-build task that runs the command from within the build process; the command line I use is as follows:

"C:\Program Files (x86)\Microsoft SDKs\Windows\v7.0A\Bin\sgen.exe" /a:"<drive>:\<path to solution>\<project folder>\bin\Debug\SqlServerProject1.dll" /p /force

I then wrap the sgen.exe command up in a batch file.  You can configure the Post Build Event in the Project properties as follows:

Integration with SQL Server

Now, build your Project and locate the resultant assemblies.  The names will reflect the name you gave your assembly, but mine are:

  • SqlServerProject1.dll
  • SqlServerProject1.XmlSerializers.dll

Enabling CLR in SQL Server

By default, CLR interoperability is not enabled in SQL Server.  To enable it, run the following in SQL Server Management Studio:

USE master;
GO
EXEC sp_configure 'CLR Enabled', 1;
GO
RECONFIGURE WITH OVERRIDE;
GO

It’s a Matter of Trust

For security reasons, SQL CLR assemblies that require elevated permissions (such as the EXTERNAL_ACCESS ours needs) cannot by default be loaded into a SQL Server database.  In order to enable the uploading of our assemblies, there are two alternatives.

Setting TRUSTWORTHY ON for the Target Database

ALTER DATABASE <DBNAME> SET TRUSTWORTHY ON

This is useful for development and/or testing because it’s quick and easy.  However, it does open up the database to the possibility of malicious assemblies being loaded as all assemblies will now be trusted.  For production environments you should instead give “trust” only to your assemblies.

Trusting Individual Assemblies

-- the key must be created in master so that a login can be created from it
USE master
GO
CREATE ASYMMETRIC KEY CrmKey
FROM EXECUTABLE FILE = '<full path to your assembly>\<assemblyname>.dll'
GO

Now create a login associated with the CrmKey and grant it External Access rights.

CREATE LOGIN [CrmLogin] FROM ASYMMETRIC KEY [CrmKey];
GO
GRANT EXTERNAL ACCESS ASSEMBLY TO [CrmLogin];
GO

Finally add a user for your login in your database:

USE <your database>
GO
CREATE USER [CrmLogin] FOR LOGIN [CrmLogin];
GO

Deployment

Project Database Properties

When creating the SQLCLR Project in Visual Studio, we created a connection to the target SQL Server database.  On the ‘Database’ tab in the Project Properties, we can influence the deployment of our SQLCLR assembly.

Set the Permission level to ‘External’ – we need this, as our assembly will require access outside of SQL Server to communicate with CRM Web Services.

Set the Assembly Owner to either:

  • dbo – if you set TRUSTWORTHY ON in the database
  • CrmLogin – if you are assigning trust only to your assemblies and created the CrmLogin as per the instructions above

Deploying the Serialisation Assembly

The deployment process will automatically deploy only your SQLCLR assembly.  To deploy the serialisation assembly, we create a Post-Deployment script in the PostDeployScript.sql file:

CREATE ASSEMBLY [SqlServerProject1.XmlSerializers]
AUTHORIZATION [dbo] FROM '<solution path>\<project folder>\bin\Debug\SqlServerProject1.XmlSerializers.dll'
WITH PERMISSION_SET = EXTERNAL_ACCESS
GO

NB: Because I am deploying my assemblies to a SQL Server on another machine, I had to configure the post deployment script above to use a UNC path to the serialisation assembly.  This is necessary, as the SQL Server must be able to gain file system access to the file.
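
For example, the FROM clause above might become something like the following; the machine and share names are purely illustrative:

CREATE ASSEMBLY [SqlServerProject1.XmlSerializers]
AUTHORIZATION [dbo] FROM '\\<build machine>\<share>\SqlServerProject1\bin\Debug\SqlServerProject1.XmlSerializers.dll'
WITH PERMISSION_SET = EXTERNAL_ACCESS
GO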

Build and Deploy

Now… build your Project.  Assuming it builds successfully, you can Deploy it (the Deploy process actually does a build before it Deploys, but it can’t hurt to test the build first).

To deploy:

  1. Right-click on the Project
  2. Click ‘Deploy’

If it all goes as expected, your assemblies will be uploaded into the target database and you will be able to find them in SQL Server Management Studio by:

  1. Expanding the target database
  2. Expanding ‘Programmability’
  3. Expanding ‘Assemblies’

You will also notice that the Deployment process creates a “wrapper” Stored Procedure for the SQLCLR Stored Procedure implemented in my assembly.  The definition of that Stored Procedure is as follows:

CREATE PROCEDURE [dbo].[CreateContact]
	@firstName [nvarchar](4000),
	@lastName [nvarchar](4000),
	@email [nvarchar](4000)
WITH EXECUTE AS CALLER
AS
EXTERNAL NAME [SqlServerProject1].[StoredProcedures].[CreateContact]
GO

EXEC sys.sp_addextendedproperty @name=N'SqlAssemblyFile', @value=N'CreateContact.cs' , @level0type=N'SCHEMA',@level0name=N'dbo', @level1type=N'PROCEDURE',@level1name=N'CreateContact'
GO

EXEC sys.sp_addextendedproperty @name=N'SqlAssemblyFileLine', @value=N'13' , @level0type=N'SCHEMA',@level0name=N'dbo', @level1type=N'PROCEDURE',@level1name=N'CreateContact'
GO

To call our new Stored Procedure, simply use the following:

DECLARE	@return_value int

EXEC	@return_value = [dbo].[CreateContact]
		@firstName = N'Testy',
		@lastName = N'Tester',
		@email = N'testiest@test.com'

SELECT	'Return Value' = @return_value

If all goes well, the call returns 0 and a shiny new Contact appears in CRM.

Conclusion

I hope this post makes sense.  It’s rather long, rather involved and I’ve had a really long day.  So, although I’ve read through it to make sure it isn’t obviously unintelligible, I can’t guarantee that it is without issue(s).  If anyone spots anything that requires clarification and/or modification, let me know.

Enjoy.

April 1, 2011

Post CRM 2011 Upgrade from CRM 4.0 – Resolve Case Error – Allotment overage is invalid

Filed under: CRM, CRM 2011 Upgrade, SQL Server, Transact SQL — pogo69 [Pat Janes] @ 07:57

We started receiving this error whenever we tried to resolve a Case (incident) after the upgrade of our internal CRM system from 4.0 to 2011.

The exact text of the error message is as follows:

Unhandled Exception: System.ServiceModel.FaultException`1[[Microsoft.Xrm.Sdk.OrganizationServiceFault, Microsoft.Xrm.Sdk, Version=5.0.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]]: Allotment overage is invalid.Detail:
<OrganizationServiceFault xmlns:i=”http://www.w3.org/2001/XMLSchema-instance&#8221; xmlns=”http://schemas.microsoft.com/xrm/2011/Contracts”&gt;
<ErrorCode>-2147204341</ErrorCode>
<ErrorDetails xmlns:d2p1=”http://schemas.datacontract.org/2004/07/System.Collections.Generic&#8221; />
<Message>Allotment overage is invalid.</Message>
<Timestamp>2011-03-09T14:41:20.8651607Z</Timestamp>
<InnerFault i:nil=”true” />
<TraceText i:nil=”true” />
</OrganizationServiceFault>

Which… meant absolutely nothing to me, and nothing to any of the other people suffering from the same issue in a recent thread on the MSDN CRM Forum.

To test my theory that it was only upgraded Contract Lines that caused the error, I created a new Contract and Contract Line and compared them with the pre-existing upgraded versions.  The offending data is the AllotmentsOverage column in the ContractDetailBase table; for all of the old Contract Lines in our DB post-conversion, the value of this field was NULL, which is obviously invalid (as per the error message we’re all receiving).
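
If you want to check whether your own upgraded organisation database is affected, a quick count of the offending rows will tell you:

select count(*) from ContractDetailBase where AllotmentsOverage is null;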

I ran a query to update all NULL values to 0 (a new Contract Line that I created for testing had this field set to 0):

update ContractDetailBase set AllotmentsOverage = ISNULL(AllotmentsOverage, 0);

After the update, all was well again!

February 21, 2011

SQL Server Database Maintenance for the CRM Developer

Filed under: CRM, DBMS Management — Tags: , , , , , , — pogo69 [Pat Janes] @ 14:45

Overview

SQL Server has always been relatively easy to install and configure.  The GUI tools provided make “part-time DBA” duties a reality for most developers working with Microsoft products; I’m sure this applies, at least on occasion, to almost every developer working with Dynamics CRM.

Unfortunately, the foisting of DBA duties upon unsuspecting developers is not usually accompanied by commensurate training.  The ability to “point-n-click” administer a SQL Server database rarely translates into optimum performance and can often help bring a database to a standstill.

The one area in which I have found every single production CRM database lacking is index maintenance.

Index Maintenance

In order to allow you to efficiently interact with your data (CRUD operations), the SQL Server query optimiser uses index statistics to formulate an execution plan.  Indexes are stored, just like the data to which they point, in pages.  When a page is filled (or filled to a certain level dependent upon the “fill factor”, which I will discuss briefly in the following section), further additions to the index will be placed in a subsequent page.

Over time the statistics pertaining to an index become stale; the data within an index becomes fragmented.  Regular Index Maintenance can be used to help alleviate these issues.

Fill Factor

When creating an Index, you are given the option of specifying the “fill factor”.  The fill factor defines “how full each leaf level page is when the index is created”.  Choosing a fill factor appropriate to the row size of the index can influence the prevalence of page splits, and correspondingly how fragmented your index will become over time.

So, while the main focus of this post is to look at how we can “de-fragment” our indexes, intelligently defining them from the outset can help ensure that the maintenance required is minimal.
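
For example, a fill factor of 80 leaves 20% free space in each leaf page to absorb new rows between rebuilds; the index and table names below are illustrative only, not part of the CRM schema:

CREATE NONCLUSTERED INDEX IX_Contact_FullName
ON dbo.SomeLargeTable (FullName)
WITH (FILLFACTOR = 80);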

For further information, explained in terms that most of us should be able to understand, and with a great deal more clarity than I am likely to be able to deliver, see the following blog posting:

Index Fill Factor & Performance Considerations

Maintenance Plans

SQL Server Management Studio (and SQL Server Enterprise Manager in SQL 2000) provide GUI Tools to visually define “Maintenance Plans” that allow the scheduled maintenance of your SQL Server databases.  There are a number of built-in tasks available including:

  • Rebuild Index Task
  • Reorganize Index Task

Reorganising an Index attempts to defragment the data within the Index, much like the Windows Disk Defragmenter does with your file system.  Rebuilding an Index re-creates the Index from scratch.

These tasks allow you to rebuild the Indexes for any or all of your databases, and any or all tables within each database.

At the very least, you should schedule one or both of these tasks to regularly maintain the indexes within your CRM database <OrgName_MSCRM>.  You could, for instance, schedule a nightly full reorganisation and a weekly full rebuild.  I have never yet come across a CRM implementation where at least this bare minimum level of maintenance is being performed.
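
If you prefer Transact SQL to the Maintenance Plan designer, the per-table equivalent of those two tasks looks like the following (ContactBase is just an example table; in practice you would run this for each table, or generate it dynamically):

-- nightly: defragment all indexes on the table in place
ALTER INDEX ALL ON dbo.ContactBase REORGANIZE;

-- weekly: rebuild all indexes on the table from scratch
ALTER INDEX ALL ON dbo.ContactBase REBUILD;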

Index Fragmentation

Unfortunately, these tasks only allow you to indiscriminately reorganise/rebuild indexes.  To efficiently maintain your database, indexes should be reorganised and/or rebuilt only when they require it.  Best practice advises:

  • Index fragmentation <= 30% – REORGANIZE
  • Index fragmentation > 30% – REBUILD

To REORGANIZE an Index:

ALTER INDEX <index_name> ON <table_name> REORGANIZE;

To REBUILD an Index:

ALTER INDEX <index_name> ON <table_name> REBUILD WITH (ONLINE = ON);

NB: The reorganisation of an Index is always an online operation; that is, the reorganisation does not cause database users to be unable to access the Index during query execution.  By default however, Index rebuilds take the Index offline and as such users will be unable to access the Index for the duration of the rebuild.  The (ONLINE = ON) option allows the online rebuilding of an Index, but is only available in the Enterprise Edition of SQL Server.

To discover the current level of fragmentation of an Index:

DBCC SHOWCONTIG ('<table_name>', '<index_name>');
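
DBCC SHOWCONTIG still works, but from SQL Server 2005 onwards the sys.dm_db_index_physical_stats DMV is the recommended (and more flexible) way to obtain the same information; for example:

SELECT
    OBJECT_NAME(ips.object_id) AS table_name,
    i.name AS index_name,
    ips.avg_fragmentation_in_percent
FROM
    sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') ips
JOIN
    sys.indexes i ON i.object_id = ips.object_id AND i.index_id = ips.index_id
ORDER BY
    ips.avg_fragmentation_in_percent DESC;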

A Home-grown Index Maintenance Plan

It was while trying to fix an issue with “out of control” SQL Server log file growth that I came across an outstanding library of Transact SQL that can be used to programmatically and intelligently reorganise and/or rebuild your indexes dependent upon their current level of fragmentation:

Index Defrag Script v4.0 – by the SQL Fool, Michelle Ufford

By default, the script will reorganise every Index with a level of fragmentation greater than 10% and rebuild every Index with greater than 30% fragmentation.

You can use this script, as is, and it will be a vast improvement on the built-in Maintenance Plan tasks.

Log File Growth and Indiscriminate Index Rebuilds

We were experiencing the above-mentioned log file growth due to the Maintenance Plan ‘Rebuild Index Task’ indiscriminately rebuilding every index in our database.  There were some very large indexes in the database in question, but many of them rarely changed and thus did not require rebuilding.  Index rebuilds are a logged operation and were therefore causing enormous growth of the log file; before I began my crusade to eradicate the rampant growth, the logfile for the database in question regularly blew out to over 100GB (for a data file of approximately 120GB).

After the implementation of the new Maintenance schedule, we were able to keep the logfile consistently under 5GB.

The following steps describe the sequence of events in the new schedule:

  1. Backup the LOG
  2. Set the database to ‘Bulk Logged’ mode
  3. Run the Index Defrag Script
  4. Set the database back to ‘Full’ recovery mode
  5. Backup the LOG again
  6. Shrink the LOG

We back up the LOG file, both pre and post maintenance, due to the pre-existing hourly LOG backup schedule.  This was in place so that we could restore to within one hour of any point in time, should anything catastrophic have happened to the database.

If you would like to implement something similar in your own CRM database(s), you will need to obtain a copy of the Index Defrag Script v4.0 from SQL Fool (I use it as is from within my own scripts).  You will also need copies of the following:

Optimisation Script

This script is the “Meta Script” that calls and controls the entire maintenance process; it takes only two parameters:

  • @DATABASE – name of the database you wish to maintain
  • @logsize in MB – target size of the resultant database LOG file – you’ll have to play around a little bit to determine the optimum size for your DB, as the SHRINK operation will fail if you choose too small a number

NB: I’ve hardcoded the directory into which the LOG backup files are placed; I really shouldn’t have, but… I had to leave something for others to do.

CREATE PROCEDURE [dbo].[dba_optimiseDB]
(
 @DATABASE varchar(128),
 @logsize int = 2048
)
AS
 DECLARE @RC int;
 DECLARE @minFragmentation float;
 DECLARE @rebuildThreshold float;
 DECLARE @executeSQL bit;
 DECLARE @defragOrderColumn nvarchar(20);
 DECLARE @defragSortOrder nvarchar(4);
 DECLARE @timeLimit int;
 DECLARE @tableName varchar(4000);
 DECLARE @forceRescan bit;
 DECLARE @scanMode varchar(10);
 DECLARE @minPageCount int;
 DECLARE @maxPageCount int;
 DECLARE @excludeMaxPartition bit;
 DECLARE @onlineRebuild bit;
 DECLARE @sortInTempDB bit;
 DECLARE @maxDopRestriction tinyint;
 DECLARE @printCommands bit;
 DECLARE @printFragmentation bit;
 DECLARE @defragDelay char(8);
 DECLARE @debugMode bit;
 SET @minFragmentation = 10;      -- 10%
 SET @rebuildThreshold = 30;      -- 30%
 SET @executeSQL = 1;
 SET @defragOrderColumn = 'range_scan_count';
 SET @defragSortOrder = 'DESC';
 SET @timeLimit = 120;       -- 2hrs
 SET @tableName = NULL;       -- all tables
 SET @forceRescan = 1;
 SET @scanMode = 'LIMITED';      -- LIMITED / SAMPLED / DETAILED
 SET @minPageCount = 8;
 SET @maxPageCount = NULL;
 SET @excludeMaxPartition = 0;
 SET @onlineRebuild = 1;
 SET @sortInTempDB = 1;
 SET @maxDopRestriction = NULL;
 SET @printCommands = 1;
 SET @printFragmentation = 1;
 SET @defragDelay = '00:00:05';
 SET @debugMode = 0;
 -- take pre-optimise log backup
 declare @folder nvarchar(max);
 declare @file nvarchar(max);
 set @folder = N'\\data\backups\Database Maintenance\MSCRM Database Backup\CRM Optimise\';
 set @file = @DATABASE + N'_' + dbo.fFormatDateTime(GETDATE(), 'TIMESTAMP');
 EXEC dba_backupLog @DATABASE, @folder, @file;
 -- switch to bulk logged mode
 EXEC('ALTER DATABASE ' + @DATABASE + ' SET RECOVERY BULK_LOGGED');
 -- re-index your little heart out...
 EXECUTE @RC = [dba].[dbo].[dba_indexDefrag_sp]
    @minFragmentation
   ,@rebuildThreshold
   ,@executeSQL
   ,@defragOrderColumn
   ,@defragSortOrder
   ,@timeLimit
   ,@DATABASE
   ,@tableName
   ,@forceRescan
   ,@scanMode
   ,@minPageCount
   ,@maxPageCount
   ,@excludeMaxPartition
   ,@onlineRebuild
   ,@sortInTempDB
   ,@maxDopRestriction
   ,@printCommands
   ,@printFragmentation
   ,@defragDelay
   ,@debugMode;
 --switch back to full recovery mode
 EXEC('ALTER DATABASE ' + @DATABASE + ' SET RECOVERY FULL');
 -- take post-optimise log backup
 set @file = @DATABASE + N'_' + dbo.fFormatDateTime(GETDATE(), 'TIMESTAMP');
 EXEC dba_backupLog @DATABASE, @folder, @file;
 -- shrink it
 EXEC dba_shrinkLog @DATABASE, @logsize;
GO

LOG Backup Script

CREATE PROCEDURE [dbo].[dba_backupLog]
(
 @DATABASE VARCHAR(128),
 @folder VARCHAR(MAX),
 @file VARCHAR(MAX)
)
AS

 DECLARE @logpath VARCHAR(MAX);
 SELECT @logpath = @folder + @file + '.TRN';

 BACKUP LOG @DATABASE
  TO DISK = @logpath WITH NOFORMAT,
  NOINIT,
  NAME = @file,
  SKIP,
  REWIND,
  NOUNLOAD,
  STATS = 10;

 declare @backupSetId as int;
 select
  @backupSetId = position
 from
  msdb..backupset
 where
  database_name = @DATABASE
 and
  backup_set_id =
   (
    select
     max(backup_set_id)
    from
     msdb..backupset
    where
     database_name = @DATABASE
   );
 
if @backupSetId is null
  begin
   declare @error varchar(max);
   set @error = N'Verify failed. Backup information for database ''' + @DATABASE + ''' not found.';
   raiserror(@error, 16, 1);
  end;

 RESTORE VERIFYONLY
  FROM DISK = @logpath WITH FILE = @backupSetId,
  NOUNLOAD,
  NOREWIND;

GO

Shrink LOG Script

CREATE PROCEDURE [dbo].[dba_shrinkLog]
(
 @DATABASE nvarchar(128),
 @size int = 2048
)
AS
 -- build the command in a variable first: EXEC() cannot call functions such as CAST inline,
 -- and the INT parameter @size must be converted to a string before it can be concatenated
 declare @sql nvarchar(max);
 set @sql =
  'USE ' + @DATABASE + ';' +
  'declare @log_name varchar(max);' +
  'select @log_name = [name] from sys.database_files where type_desc = ''LOG'';' +
  'DBCC SHRINKFILE(@log_name, ' + CAST(@size AS nvarchar(20)) + ');';

 EXEC (@sql);

GO

Summary

I create a standalone database ‘DBA’ to house these and other database maintenance objects.  The results of the Index Defrag Script are stored in tables created as part of the installation of the scripts you can download from SQL Fool.  It can be educational to see just which Indexes are being affected in what way.
