Discussion:
Good checksum component, anyone?
(too old to reply)
Gorm Braarvig
2008-12-20 15:53:33 UTC
Permalink
Hi!

I currently use conesans checksum component, but I need a md5-checksum
pushed into a bigint (I believe conesans uses crc-32).
Has anyone solved this?

regards,
Gorm Braarvig
Russell Fields
2008-12-22 21:59:13 UTC
Permalink
Gorm,

Starting with SQL Server 2005 there is a HASHBYTES keyword that allows you
to choose from a set of methods and returns VARBINARY. It can be cast into
BIGINT, but it is not reversable.

The MD5 method returns 16 bytes, but BIGINT is only 8 bytes.

See the output of the following:

SELECT HASHBYTES('MD5','Gorm Braarvig') AS BINMD5,
CAST (HASHBYTES('MD5','Gorm Braarvig') AS BIGINT) AS BIGINTMD5,
CAST (CAST (HASHBYTES('MD5','Gorm Braarvig') AS BIGINT)
AS VARBINARY(8000)) AS BINAGAINMDF

Results:
0x647679B1F3F515655EC3B478951307CF 6828499890013931471 0x5EC3B478951307CF

Does that help?

RLF
Post by Gorm Braarvig
Hi!
I currently use conesans checksum component, but I need a md5-checksum
pushed into a bigint (I believe conesans uses crc-32).
Has anyone solved this?
regards,
Gorm Braarvig
Gorm Braarvig
2008-12-25 17:36:23 UTC
Permalink
Hi Russel!

Can't use it, need to do it in the flow. I was thinking of merging the 16
bytes into 8 by xoring the two Int64 together, would be sufficient for most
(all?) applications.

Thanks,
Gorm
Post by Russell Fields
Gorm,
Starting with SQL Server 2005 there is a HASHBYTES keyword that allows you
to choose from a set of methods and returns VARBINARY. It can be cast
into BIGINT, but it is not reversable.
The MD5 method returns 16 bytes, but BIGINT is only 8 bytes.
SELECT HASHBYTES('MD5','Gorm Braarvig') AS BINMD5,
CAST (HASHBYTES('MD5','Gorm Braarvig') AS BIGINT) AS BIGINTMD5,
CAST (CAST (HASHBYTES('MD5','Gorm Braarvig') AS BIGINT)
AS VARBINARY(8000)) AS BINAGAINMDF
0x647679B1F3F515655EC3B478951307CF 6828499890013931471
0x5EC3B478951307CF
Does that help?
RLF
Post by Gorm Braarvig
Hi!
I currently use conesans checksum component, but I need a md5-checksum
pushed into a bigint (I believe conesans uses crc-32).
Has anyone solved this?
regards,
Gorm Braarvig
Russell Fields
2008-12-25 19:56:31 UTC
Permalink
Gorm,

Sorry, I don't know what to say. It strikes me as odd to feel it necessary
to use MD5, then to XOR the first 8 bytes on the second 8 bytes. I would
think that that certainly has to do something negative to the hash.

Maybe one of these will be useful to you. Google: MD5 Component

http://xstandard.com/en/documentation/xmd5/
http://www.download.com/MD5-Com-Component/3000-2383_4-10449255.html
http://www.traction-software.co.uk/devmd5/index.html

RLF
Post by Gorm Braarvig
Hi Russel!
Can't use it, need to do it in the flow. I was thinking of merging the 16
bytes into 8 by xoring the two Int64 together, would be sufficient for
most (all?) applications.
Thanks,
Gorm
Post by Russell Fields
Gorm,
Starting with SQL Server 2005 there is a HASHBYTES keyword that allows
you to choose from a set of methods and returns VARBINARY. It can be
cast into BIGINT, but it is not reversable.
The MD5 method returns 16 bytes, but BIGINT is only 8 bytes.
SELECT HASHBYTES('MD5','Gorm Braarvig') AS BINMD5,
CAST (HASHBYTES('MD5','Gorm Braarvig') AS BIGINT) AS BIGINTMD5,
CAST (CAST (HASHBYTES('MD5','Gorm Braarvig') AS BIGINT)
AS VARBINARY(8000)) AS BINAGAINMDF
0x647679B1F3F515655EC3B478951307CF 6828499890013931471
0x5EC3B478951307CF
Does that help?
RLF
Post by Gorm Braarvig
Hi!
I currently use conesans checksum component, but I need a md5-checksum
pushed into a bigint (I believe conesans uses crc-32).
Has anyone solved this?
regards,
Gorm Braarvig
Gorm Braarvig
2008-12-25 22:15:41 UTC
Permalink
Greetings,

thanks for the swift reply and the links, but I am not closer to a solution,

what konesans check sum component does is: you can put a component in your
data flow and map up columns, etc. and get out a nice checksum you can add
to you data, later you compare the checksum to your previous checksum to
evaluate if any of the fields you care about has changed. Good idea.

however...

- the component, while having the right interface has an algorithm that
gives
the same checksum to different data, because it (if I guess correctly) uses
crc-32.

- md5 is overkill for my use, so I don't want to store a binary field or two
bigint's.
I want to use an algorithm that gives me a 64-bit checksum and I thought
using md5 and then combining it to half the size was a nice strategy (no???)

- If I was to implement this myself, (which I am capable of, but can't do,
since the structure of the projects and politics keeps me away from it), I
would simply use .NET crypto routines in a custom component. If this was
infeasable, I would attempt to use Microsoft Base Cryptographic Provider,
which I have had exposure
to in a previous gig...

my task is unsolved: a delta detection mechanism that works "good enough"
(2^32 is not, 2^64 is) while not taking up unnecessary space.

I wonder: has anyone else solved this in the data flow (in an easy way
without using a script component)?

thanks,
Gorm
Post by Russell Fields
Gorm,
Sorry, I don't know what to say. It strikes me as odd to feel it
necessary to use MD5, then to XOR the first 8 bytes on the second 8 bytes.
I would think that that certainly has to do something negative to the
hash.
Maybe one of these will be useful to you. Google: MD5 Component
http://xstandard.com/en/documentation/xmd5/
http://www.download.com/MD5-Com-Component/3000-2383_4-10449255.html
http://www.traction-software.co.uk/devmd5/index.html
RLF
Post by Gorm Braarvig
Hi Russel!
Can't use it, need to do it in the flow. I was thinking of merging the 16
bytes into 8 by xoring the two Int64 together, would be sufficient for
most (all?) applications.
Thanks,
Gorm
Post by Russell Fields
Gorm,
Starting with SQL Server 2005 there is a HASHBYTES keyword that allows
you to choose from a set of methods and returns VARBINARY. It can be
cast into BIGINT, but it is not reversable.
The MD5 method returns 16 bytes, but BIGINT is only 8 bytes.
SELECT HASHBYTES('MD5','Gorm Braarvig') AS BINMD5,
CAST (HASHBYTES('MD5','Gorm Braarvig') AS BIGINT) AS BIGINTMD5,
CAST (CAST (HASHBYTES('MD5','Gorm Braarvig') AS BIGINT)
AS VARBINARY(8000)) AS BINAGAINMDF
0x647679B1F3F515655EC3B478951307CF 6828499890013931471
0x5EC3B478951307CF
Does that help?
RLF
Post by Gorm Braarvig
Hi!
I currently use conesans checksum component, but I need a md5-checksum
pushed into a bigint (I believe conesans uses crc-32).
Has anyone solved this?
regards,
Gorm Braarvig
Continue reading on narkive:
Loading...