Row Counts for Tables Discrepancies

Moderator: NorbertKrupa

id10t
GURU
GURU
Posts: 732
Joined: Mon Apr 16, 2012 2:44 pm

Re: Row Counts for Tables Discrepancies

Post by id10t » Thu Oct 03, 2013 9:22 pm

Hi!

Approve if Im right:

1. Fact has no dups and `upd_flag` = {'U', 'I'} (ideally)
2. Stage has no dups
3. If join key
- MATCH then <f_inventory.maint_date = stage_f_inventory.maint_date> and <f_inventory.upd_flag = 'U'>
- NOT MATCH then <f_inventory.maint_date = stage_f_inventory.maint_date> and <f_inventory.upd_flag = 'I'>


from my understanding, query should be like follow:

Code: Select all

UPDATE non_sales.f_inventory 
   SET maint_date = non_sales.stage_f_inventory.maint_date, 
       upd_flag = 'U'
  FROM non_sales.stage_f_inventory
 WHERE (non_sales.f_inventory.week_gen_key = non_sales.stage_f_inventory.week_gen_key)  
   AND (non_sales.f_inventory.item_gen_key = non_sales.stage_f_inventory.item_gen_key)
   AND (non_sales.f_inventory.store_gen_key = non_sales.stage_f_inventory.store_gen_key);
PS

Code: Select all

SET back_order_u ... WHERE... AND back_order_u IS NOT NULL ;
Query asks for rebuilding(IMHO you need correlated sub-query with EXISTS)

Code: Select all

SET col... WHERE EXISTS (SELECT col_i, col_j FROM stage_tbl as stg WHERE trg_col_i = stg.col_i)
Last edited by id10t on Thu Oct 03, 2013 9:34 pm, edited 1 time in total.

clv100
Beginner
Beginner
Posts: 27
Joined: Tue Oct 16, 2012 6:04 pm

Re: Row Counts for Tables Discrepancies

Post by clv100 » Thu Oct 03, 2013 9:31 pm

Yes. That is the way it is set up to work. The table has 32 updateable columns and the last updates are to set the maint_date and upd_flag('U') where the keys match.The next part of the scripts us where INSERT is done from the stage f_inventory columns to f_inventory with the upd_flag = 'I' and the maint_date = current_date where there are NO matching keys.

The problem is that I am seeing dupes and I am getting "data integrity violations" on a delete - I believe I am seeing and counting records that are being marked for deletion when an update occurs. I can actually list them at some point. I had 36 rows as shown above this morning when I checked for the duplicates - when I deleted in the afternoon I only deleted 12 rows (count returned from SQL) but my table count was 36 less.

id10t
GURU
GURU
Posts: 732
Joined: Mon Apr 16, 2012 2:44 pm

Re: Row Counts for Tables Discrepancies

Post by id10t » Thu Oct 03, 2013 10:08 pm

1. Try EXISTS
2. EXPLAIN depends on statistics and since I have no your data, my EXPLAIN doesn't representative. Can you post EXPLAIN for some UPDATE? (better explain for update of maint_date and upd_flag columns)
3. What are table types ? temp or base / segmented or replicated? I can't reproduce your problem.

>> "data integrity violations" on a delete
Without script I can't help.

User avatar
nnani
Master
Master
Posts: 302
Joined: Fri Apr 13, 2012 6:28 am
Contact:

Re: Row Counts for Tables Discrepancies

Post by nnani » Fri Oct 04, 2013 7:33 am

For delete, you can try this

Code: Select all

 delete from non_sales.f_inventory 
 where exists ( select null from non_sales.f_inventory
 where store_gen_key = 500920
 and week_gen_key = 52013035
 and item_gen_key in (9000767644,9000767738,9000767929,9000767453,9000767466,9000767479,9000767563,9000773447,9000776101,9000767602,9000767628,9000776169) sub;
This should show you a count of 36 when deleting from stage table

Hope this helps.. :)
nnani........
Long way to go

You can check out my blogs at vertica-howto

clv100
Beginner
Beginner
Posts: 27
Joined: Tue Oct 16, 2012 6:04 pm

Re: Row Counts for Tables Discrepancies

Post by clv100 » Fri Oct 04, 2013 3:27 pm

Thanks - can you tell me what the select "null" and the sub at the end of the SQL statement mean??

User avatar
nnani
Master
Master
Posts: 302
Joined: Fri Apr 13, 2012 6:28 am
Contact:

Re: Row Counts for Tables Discrepancies

Post by nnani » Mon Oct 07, 2013 8:47 am

#clv100,
can you tell me what the select "null" and the sub at the end of the SQL statement mean??
null - It sets up the Exists status as true if there are records in the select query OR It prevents to bring data when I don't want data, but just the exist status
Check this link http://blog.sqlauthority.com/2010/02/21 ... rom-table/

sub - It is just an alias for the sub-query used in the query

Hope this helps.. :)
nnani........
Long way to go

You can check out my blogs at vertica-howto

clv100
Beginner
Beginner
Posts: 27
Joined: Tue Oct 16, 2012 6:04 pm

Re: Row Counts for Tables Discrepancies

Post by clv100 » Thu Oct 24, 2013 8:18 pm

Just as an update:

Per Vertica, we ended up backing the table up(verifying that there were no dupes first), dropped and created the table and then copied from the backup back to the table . This was supposed to solve the problem but the next day there were a few "new" dupes. According to Vertica, there is a known bug that randomly can cause this kind of issue BUT I re-did the SQL script update code to use a WHERE EXISTS and we have not had the dupes issue again. It was very strange because it was ONLY occurring for ONE particular store - but there was nothing different with the data for that store which comes in the same format as all the other stores.

Post Reply

Return to “New to Vertica Database Administration”