CloverETL Forum

CloverETL Engine, Designer & Server related discussion forums
 Post subject: URGENT: remove duplicate records and sorting in DEDUP??
Posted: Sat Aug 30, 2008 12:35 am

Joined: Thu Mar 20, 2008 7:19 pm
Posts: 71
Hi,

How do I remove duplicate records without specifying all the fields? My record metadata has 2000 fields...
Here is a subset of my input data, sorted by REFERENCE (primary key):

"REFERENCE","NAME","NO"
"000000010271 ","WFB ","1"
"000000010271 ","WFB ","1"
"000000010272 ","ABC ","1"
"000000010272 ","ABC ","2"

I want an output result like this:

"REFERENCE","NAME","NO"
"000000010271 ","WFB ","1" (removed the duplicate)
"000000010272 ","ABC ","1"
"000000010272 ","ABC ","2"

I know I can use DEDUP and set dedupKey="REFERENCE;NAME;NO" to get this output, but if my input data has 2000 fields, I do not want to list all 2000 of them in dedupKey. Moreover, can dedupKey even be set to such a long string? So, is there a way to tell CloverETL to remove duplicate records when there are 2000 fields to match?

I would think DEDUP just needs a flag, say remove_only_if_all_fields_matches, that can be set to true and makes it take the list of fields from the FMT. If the values of all the respective fields match, the record is a duplicate and gets removed. That way, dedupKey would not need to be set to a large number of field names, right?

Just to make sure, DEDUP does not sort the records, right?

Any help would be greatly appreciated :)

thanks,
al


 Post subject: Re: URGENT: remove duplicate records and sorting in DEDUP??
Posted: Thu Dec 17, 2009 11:32 am

Joined: Fri Jul 20, 2007 9:28 am
Posts: 552
Hello,
I've created a new issue for your request (http://bug.cloveretl.com/view.php?id=3401).
As for sorting: Dedup doesn't sort records, but it expects them to be sorted according to the key fields. If they are not, it deduplicates only within each run of consecutive input records that have the same key.
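
To illustrate the point about sorting, here is a minimal Python sketch (not CloverETL code) of deduplication that only looks at consecutive records. The function name dedup_adjacent is made up for this example, and it assumes that the first record of each group is the one kept.

```python
from itertools import groupby


def dedup_adjacent(records, key_fields):
    """Keep the first record of each run of consecutive records with the same key."""
    def key(rec):
        return tuple(rec[f] for f in key_fields)
    # groupby only groups neighbouring records, so equal records that are
    # separated by a different record end up in different groups.
    return [next(group) for _, group in groupby(records, key=key)]


all_fields = ["REFERENCE", "NAME", "NO"]
unsorted_input = [
    {"REFERENCE": "000000010271", "NAME": "WFB", "NO": "1"},
    {"REFERENCE": "000000010272", "NAME": "ABC", "NO": "1"},
    {"REFERENCE": "000000010271", "NAME": "WFB", "NO": "1"},
]

# Unsorted input: the two identical records are not neighbours, so both survive.
not_removed = dedup_adjacent(unsorted_input, all_fields)

# Sorted input: the identical records become neighbours and one is dropped.
sorted_input = sorted(unsorted_input, key=lambda r: tuple(r[f] for f in all_fields))
removed = dedup_adjacent(sorted_input, all_fields)

print(len(not_removed), len(removed))
```

With the unsorted list all three records come through, while the sorted list is reduced to two.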

_________________
Agata Vackova
Javlin a.s.
agata.vackova@javlin.eu


 Post subject: Re: URGENT: remove duplicate records and sorting in DEDUP??
Posted: Thu Dec 17, 2009 3:23 pm

Joined: Mon Feb 23, 2009 4:21 pm
Posts: 40
Hello Achan,

You can click inside the left pane of the Edit key dialog (the Fields pane), press Ctrl+A (all the fields will be selected and turn blue), and then click the Right arrow key.

This moves all the fields to the pane on the right. You only need to confirm by clicking OK.

Before this, do the same in the ExtSort component, so that the records reach Dedup already sorted on the same key.
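
The same idea (sort on every field, then deduplicate on every field) can be sketched outside CloverETL in a few lines of Python. This is only an illustration under stated assumptions: the function dedup_on_all_fields is hypothetical, the sample is the data from the first post, and the key is taken from the CSV header so that no field names have to be typed by hand. The last line builds a semicolon-separated key string in the form quoted in the first post.

```python
import csv
import io

SAMPLE = '''"REFERENCE","NAME","NO"
"000000010271 ","WFB ","1"
"000000010271 ","WFB ","1"
"000000010272 ","ABC ","1"
"000000010272 ","ABC ","2"
'''


def dedup_on_all_fields(text):
    """Sort on every field, then drop records identical to their predecessor."""
    reader = csv.DictReader(io.StringIO(text))
    fields = reader.fieldnames  # the key is simply every field in the header
    rows = sorted(reader, key=lambda r: tuple(r[f] for f in fields))
    result = []
    for row in rows:
        # After sorting, identical records are adjacent, so comparing
        # with the last kept record is enough.
        if not result or result[-1] != row:
            result.append(row)
    return fields, result


fields, unique_rows = dedup_on_all_fields(SAMPLE)
dedup_key = ";".join(fields)  # key string listing all fields
print(dedup_key, len(unique_rows))
```

For 2000 fields the code stays the same, since the field list comes from the header rather than from a hand-written key.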

I think this is what you wanted.

Best regards,

Tomas Waller

_________________
Tomas Waller
Javlin, a.s.
wallert@mail.javlin.cz

