When I execute these T-SQL statements:
DECLARE @ft dbo.MyFileTime
SET @ft = 0x0123456789ABCDEF
What's happening within the SET statement?
My first surprise was that it succeeded. Then I thought, "The hex literal, as I understand it, is implicitly a BINARY or VARBINARY. Underneath, with those attributes, MyFileTime is effectively a BINARY or VARBINARY as well. Maybe the SET is simply performing a direct assignment?"
MyFileTime is decorated with the attributes:
[SqlUserDefinedTypeAttribute(Format.Native, IsByteOrdered = true, IsFixedLength = true)]
[Serializable]
[StructLayout(LayoutKind.Sequential)]
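For reference, the type is essentially the following, trimmed down (my real ToString() is a bit different, and I've left null handling out):
using System;
using System.Data.SqlTypes;
using System.Runtime.InteropServices;
using Microsoft.SqlServer.Server;

[Serializable]
[StructLayout(LayoutKind.Sequential)]
[SqlUserDefinedTypeAttribute(Format.Native, IsByteOrdered = true, IsFixedLength = true)]
public struct MyFileTime : INullable
{
    // The two signed 32-bit halves of a FILETIME; Format.Native serializes these.
    private int dwHigh;
    private int dwLow;

    public int DwHigh { get { return dwHigh; } set { dwHigh = value; } }
    public int DwLow  { get { return dwLow;  } set { dwLow  = value; } }

    // Null handling trimmed for brevity.
    public bool IsNull { get { return false; } }
    public static MyFileTime Null { get { return new MyFileTime(); } }

    public override string ToString()
    {
        // Stand-in only; the real ToString() differs.
        return dwHigh.ToString() + "," + dwLow.ToString();
    }

    public static MyFileTime Parse(SqlString s)
    {
        // Assumed input format "high,low".
        string[] parts = s.Value.Split(',');
        MyFileTime ft = new MyFileTime();
        ft.dwHigh = int.Parse(parts[0]);
        ft.dwLow = int.Parse(parts[1]);
        return ft;
    }
}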
So, again, exactly what happens within the SET statement?
(The weird behavior has been exhibited on multiple versions of the beta, including the September CTP. The output methods I'm using to verify values (e.g., ToString()) have been tested against the precise values in question outside of SQL Server, so I'm reasonably confident that I'm not dealing with a simple logic error there.)
-Greg
I'm not entirely sure what problem you are facing, but I'll give it a shot. Let me know if this isn't what you were looking for.
You were right the first time: for a Format.Native UDT the SET just performs a direct assignment - none of your UDT code is called. If your UDT were a Format.UserDefined UDT, then your Read(BinaryReader) code would have to run to instantiate it from the bytes.
I think the solution in your case is to define a UDT validation method. The validation method is called whenever a UDT value is created directly from bytes (rather than going through Parse or another method), and you can use it to ensure that the bytes form a valid UDT.
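Concretely, you wire it up through the ValidationMethodName property on the type attribute. Here is a fragment of what that might look like; the method name and the example rule are just illustrations, not anything SQL Server requires:
[SqlUserDefinedTypeAttribute(Format.Native, IsByteOrdered = true, IsFixedLength = true,
    ValidationMethodName = "ValidateMyFileTime")]
public struct MyFileTime : INullable
{
    private int dwHigh;
    private int dwLow;

    // ... DwHigh/DwLow, Parse, ToString, and the INullable members stay
    // exactly as you already have them ...

    // SQL Server calls this whenever a MyFileTime value is created directly
    // from bytes (e.g. SET @ft = 0x0123456789ABCDEF or a CONVERT from
    // binary), but not when Parse() runs. Returning false rejects the value.
    private bool ValidateMyFileTime()
    {
        // Purely an example rule: treat a negative high dword as invalid.
        return dwHigh >= 0;
    }
}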
|||
Hi, Steven, thanks for the reply. I apologize if my initial post was a bit obtuse; reading it back, it looks like I neglected to call out the precise issue I was having.
MyFileTime wraps two ints, dwHigh and dwLow, in that order. Potentially key to our discussion: these are signed ints, not unsigned ints. When I attempt to assign a MyFileTime directly,
SET @ft = 0x0123456789ABCDEF
and then read the value back, what I get is:
0x8123456709ABCDEF
I.e., the sign bits are toggled. That was the unexpected part.
Right now I'm accessing the site from a different location, so I unfortunately can't provide the precise T-SQL statements that I was using to get these values.
Short version, then:
SET @ft = 0x0123456789ABCDEF
-- Something like: PRINT cast(@ft as binary(8))
0x8123456709ABCDEF
SET @ft.DwHigh = 0x01234567
SET @ft.DwLow = 0x89ABCDEF
-- I don't remember if this is req'd for type safety:
-- SET @ft.DwLow = -1985229329 -- aka 0x89ABCDEF
-- Something like: PRINT cast(@ft as binary(8)) (i.e., same output statement as above)
0x0123456789ABCDEF
The DwHigh and DwLow properties don't do anything exciting; they're straight assignments to the underlying data.
Is it possible that the sign bits for those underlying ints are being toggled in the course of the direct assignment?
|||
Hi Greg, your concern makes a lot more sense now!
I understand your confusion and I believe I have an answer for you, but I'm actually seeing the reverse of the behavior you describe. Can you check again to make sure? If your results are the same, can you attach your UDT source code?
Here is what I am seeing that hopefully explains things:
declare @ft filetime
SET @ft = 0x0123456789ABCDEF
select cast(@ft as binary(8)), @ft.ToString()
-- The result matches exactly what was put in
-- 0x0123456789ABCDEF    -2128394905,162254319
SET @ft.dwHigh = 0x01234567
SET @ft.dwLow = 0x89ABCDEF
select @ft.dwHigh, cast(@ft.dwHigh as binary(4))
-- This also matches exactly what was put in
-- 19088743    0x01234567
select @ft.dwLow, cast(@ft.dwLow as binary(4))
-- This also matches exactly what was put in
-- -1985229329    0x89ABCDEF
select cast(@ft as binary(8)), @ft.ToString()
-- This output seems to be wrong; however, the correct numbers are returned...
-- 0x8123456709ABCDEF    19088743,-1985229329
So, the question becomes why is setting the UDT binary to 0x0123456789ABCDEF not the same as setting the respective fields to 0x01234567 and 0x89ABCDEF?
The answer is normalization. When SQL Server stores the UDT as binary, it does so in such a way that it can sort by the binary value (this is what the IsByteOrdered=true attribute asks for) rather than having to instantiate the UDT. It can't compare the fields in the regular int binary format because, as you can see with this data, 0x01234567 is less than 0x89ABCDEF when compared as raw bytes, yet 19088743 is much greater than -1985229329.
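To make that concrete, here's a small stand-alone C# sketch of the idea. The XOR of the sign bit is my illustration of what a byte-ordered encoding of a signed int has to do, not the serializer's actual source:
using System;

class NormalizationDemo
{
    // To make signed ints sort correctly as unsigned, big-endian bytes,
    // flip the sign bit (two's complement -> offset binary).
    static uint Normalize(int value)     { return (uint)value ^ 0x80000000u; }
    static int  Denormalize(uint stored) { return (int)(stored ^ 0x80000000u); }

    static void Main()
    {
        int dwHigh = 0x01234567;                  //  19088743
        int dwLow  = unchecked((int)0x89ABCDEF);  // -1985229329

        // Setting the fields, then serializing: these are the bytes stored.
        Console.WriteLine("{0:X8}{1:X8}", Normalize(dwHigh), Normalize(dwLow));
        // -> 8123456709ABCDEF, i.e. what CAST(@ft AS binary(8)) showed above.

        // Assigning 0x0123456789ABCDEF directly treats those bytes as already
        // normalized, so the fields come back with their sign bits flipped.
        Console.WriteLine("{0},{1}", Denormalize(0x01234567u), Denormalize(0x89ABCDEFu));
        // -> -2128394905,162254319, i.e. the ToString() output above.
    }
}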
Let me know if this matches what you are seeing.
|||
Hi, Steven! My implementation of ToString() is a bit different, so instead of @ft.ToString() I tested as follows:
DECLARE @ft MyFileTime
SET @ft = 0x0123456789ABCDEF
select cast(@ft as binary(8)), @ft.DwHigh, @ft.DwLow
-- The binary result matches exactly what was put in
-- 0x0123456789ABCDEF
-- The decimal results appear to have their sign bits toggled
-- -2128394905,162254319
SET @ft.DwHigh = 0x01234567
SET @ft.DwLow = 0x89ABCDEF
select @ft.DwHigh, cast(@ft.DwHigh as binary(4))
-- This matches exactly what was put in
-- 19088743    0x01234567
select @ft.DwLow, cast(@ft.DwLow as binary(4))
-- This also matches exactly what was put in
-- -1985229329    0x89ABCDEF
select cast(@ft as binary(8)), @ft.DwHigh, @ft.DwLow
-- This output seems to be wrong; however, the correct numbers are returned...
-- 0x8123456709ABCDEF    19088743,-1985229329
My results are the same as yours, so the normalization due to the IsByteOrdered property does seem to be the culprit.
I'd like to note that the first SELECT's decimal values, in both your sample and mine, do have their sign bits toggled, presumably because the normalization takes place between the MyFileTime CLR code and SQL Server's storage of the value. It seems, then, that the surprise occurs because we're sort of violating the MyFileTime interface by accessing the underlying bytes directly. Of course, in our particular use case that's something of a necessity.
I find this behavior of the IsByteOrdered property unexpected. When I read its documentation while writing this UDT (admittedly some 6-7 months ago), I had the impression that IsByteOrdered would be a relatively "dumb" property. I didn't expect, and apparently didn't want, SQL Server and the CLR to interpret the semantics of my underlying types; rather, I expected it to simply look at the byte representation and sort on it as-is. In other words, I took IsByteOrdered to be a "trust me, I know what I'm doing" property. With the set of attributes I used on this UDT, I didn't expect the internal byte representation of a MyFileTime to differ between a byte array, a pair of ints, a pair of uints, a single long, and a single ulong.
Regarding solutions, then, my first thought is that converting those ints into uints ought to provide the desired behavior. It seems that the normalization is taking place because of the semantic meaning of a signed integral value.
If I correct the internal semantics of MyFiletime and make dwHigh and dwLow uints, would you also expect that to result in my desired behavior?
|||
Hi Greg,
The serialization format does not actually depend on the IsByteOrdered=true attribute; I mentioned it only to explain why SQL Server serializes the UDT this way. The serialization is the same whether IsByteOrdered is true or false.
The real reason you're not getting the results you expect is that you specified a Format.Native UDT, which tells SQL Server that you want it to handle the serialization for you so that you don't have to deal with it.
If your use case depends on dealing with the whole UDT at the byte level, then you may want to consider implementing it as a Format.UserDefined UDT, where you control the serialization completely. In that case, IsByteOrdered is a "trust me" attribute, since SQL Server cannot verify that your serialization is in fact byte-ordered.
Using unsigned integers internally would also give you the serialization you expected, because unsigned ints are naturally byte-ordered, but in general you should use a Format.UserDefined UDT if you require a specific serialization.
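For what it's worth, here is a rough sketch of the Format.UserDefined approach; the type name, the unsigned fields, and the big-endian layout are just my choices to match the bytes you expect, not a prescribed implementation:
using System;
using System.Data.SqlTypes;
using System.IO;
using Microsoft.SqlServer.Server;

[Serializable]
[SqlUserDefinedType(Format.UserDefined, IsByteOrdered = true,
    IsFixedLength = true, MaxByteSize = 8)]
public struct MyFileTimeUserDefined : INullable, IBinarySerialize
{
    private uint dwHigh;   // unsigned, so the raw bytes already sort correctly
    private uint dwLow;

    public void Write(BinaryWriter w)
    {
        // Big-endian, high dword first: 0x0123456789ABCDEF round-trips as-is,
        // and CAST(@ft AS binary(8)) returns exactly what Write() produced.
        byte[] bytes = new byte[8];
        for (int i = 0; i < 4; i++) bytes[i]     = (byte)(dwHigh >> (24 - 8 * i));
        for (int i = 0; i < 4; i++) bytes[i + 4] = (byte)(dwLow  >> (24 - 8 * i));
        w.Write(bytes);
    }

    public void Read(BinaryReader r)
    {
        byte[] bytes = r.ReadBytes(8);
        dwHigh = dwLow = 0;
        for (int i = 0; i < 4; i++) dwHigh = (dwHigh << 8) | bytes[i];
        for (int i = 0; i < 4; i++) dwLow  = (dwLow  << 8) | bytes[i + 4];
    }

    // Null handling trimmed for brevity.
    public bool IsNull { get { return false; } }
    public static MyFileTimeUserDefined Null { get { return new MyFileTimeUserDefined(); } }

    public override string ToString() { return dwHigh + "," + dwLow; }

    public static MyFileTimeUserDefined Parse(SqlString s)
    {
        // Assumed input format "high,low".
        string[] parts = s.Value.Split(',');
        MyFileTimeUserDefined ft = new MyFileTimeUserDefined();
        ft.dwHigh = uint.Parse(parts[0]);
        ft.dwLow  = uint.Parse(parts[1]);
        return ft;
    }
}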
Did this clear things up?
|||
That clears it up nicely, Steven. Thanks again for the info!
As an aside, it may be useful to call out this normalization process in Books Online; I'm sure I'm not the only person who'll make this mistake.
-Greg
|||My co-worker Isaac just posted an entry to the SQL CLR blog about this issue which hopefully should help other people with similar problems.
Check out http://blogs.msdn.com/sqlclr/ for the entry and feel free to comment there with any feedback or suggestions for future posts.