dupmerge - : Value too large for defined data type

First of all, heaps of gratitude for the excellent contribution, tremendously helpful tool.
I really like the way you keep my mdadm RAIDs in order, compared to some of the mangling I see under other "rescue disks" (actually re-writing the preferred minor?)
What brought me to Grml was the fact that you include the dupmerge tool out of the box, but when running it over a large set of video files (around 5 TB at the moment, ext3 filesystem) I'm getting tons of the above error message on large files (some are up to 16 GB). Having such a tool fail on the largest files kind of defeats the purpose, doesn't it? 8-)
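A note on the error text: "Value too large for defined data type" is the C library's message for EOVERFLOW, which stat() reports when a file's size does not fit in the calling program's off_t. One plausible cause, stated here as an assumption and not checked against the dupmerge source or the Grml build, is a 32-bit build without large-file support, which would fail on any file of 2 GiB or more. Under that assumption, a minimal Python sketch like the following lists the files such a build could not stat; the function names are made up for illustration.

```python
import os
import stat
import sys

# Largest size representable in a signed 32-bit off_t (2 GiB minus 1 byte).
# Assumption: the failing tool uses a 32-bit off_t; this is not confirmed.
MAX_32BIT_OFF_T = 2**31 - 1


def exceeds_32bit_off_t(size_bytes):
    """Return True if a file of this size cannot be described by a 32-bit off_t."""
    return size_bytes > MAX_32BIT_OFF_T


def find_oversized_files(root):
    """Yield (path, size) for regular files under root larger than 2 GiB - 1."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue  # skip files that vanish or cannot be examined
            if stat.S_ISREG(st.st_mode) and exceeds_32bit_off_t(st.st_size):
                yield path, st.st_size


if __name__ == "__main__":
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    for path, size in find_oversized_files(root):
        print(f"{size:>15}  {path}")
```

Run it as `python3 find_oversized.py /path/to/videos` (the script name is hypothetical). If every file dupmerge complains about shows up in the list, that would be consistent with the large-file explanation.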
I see you're running v1.70, while Sourceforge lists v1.73 as current (as of 2008!), and I see references on the interwebs to v1.74, so consider this, if nothing else, a request to update.
If it's not difficult, it would also be great to be able to check out one or two other similar deduplication tools, unless you've already checked them all out and consider dupmerge2 the best?
Here's a good overview:
http://www.asheesh.org/note/software/duplicate-files.html
My vote would be for freedup and rdfind.
Thanks for your consideration

On Sun, 06 Feb 2011 22:59:54 +0700, hansbkk wrote:
> Here's a good overview:
which he picked rdfind as the winner.
> My vote would be for freedup and rdfind.
+1 for rdfind, which was included in sid not long ago.
I looked at its algorithm and speed comparison -- it is definitely a winner.

* T o n g mlist4suntong@yahoo.com [Mon Feb 07, 2011 at 12:42:26AM +0000]:
> On Sun, 06 Feb 2011 22:59:54 +0700, hansbkk wrote:
> > Here's a good overview:
> which he picked rdfind as the winner.
> > My vote would be for freedup and rdfind.
> +1 for rdfind, which was included in sid not long ago.
> I looked at its algorithm and speed comparison -- it is definitely a winner.
Therefore I just added rdfind to the software selection of GRML_FULL. Thanks for reporting back!
regards, -mika-

Thanks much Mika, a great (and quick!) response.
I've been doing some scripting based on dupmerge, so please also let us know whether that will be updated or replaced by rdfind.
2011/2/7 Michael Prokop mika@grml.org:
> * T o n g mlist4suntong@yahoo.com [Mon Feb 07, 2011 at 12:42:26AM +0000]:
> > On Sun, 06 Feb 2011 22:59:54 +0700, hansbkk wrote:
> > which he picked rdfind as the winner.
> > +1 for rdfind, which was included in sid not long ago.
> Therefore I just added rdfind to the software selection of GRML_FULL. Thanks for reporting back!
> regards, -mika-

* hansbkk@gmail.com wrote [09.02.11 20:22]:
> I've been doing some scripting based on dupmerge, so please also let us know whether that will be updated or replaced by rdfind.
rdfind is already included in grml-full. It should also be in the daily images. I'm not sure whether dupmerge will be dropped, but it is still included in Grml.
Ulrich

participants (4)
- hansbkk@gmail.com
- Michael Prokop
- T o n g
- Ulrich Dangel