Weird behavior with RANDOMINT

Moderator: NorbertKrupa

User avatar
heather
Newbie
Newbie
Posts: 22
Joined: Fri Dec 28, 2012 3:01 pm

Weird behavior with RANDOMINT

Post by heather » Tue Sep 24, 2013 2:17 pm

Hi,

I am experiencing some really weird behavior with the RANDOMINT function.

I have a table named test_date which has two columns. A date_key which is just a number representing a date (YYYYMMDD) and a column called rn (row number).

Code: Select all

dbadmin=> select * from test_date order by 1, 2;
 date_key |  rn  
----------+------
 19200101 |    1
 19200102 |    2
 19200103 |    3
 19200104 |    4
 19200105 |    5
 19200106 |    6
 19200107 |    7
 19200108 |    8
 19200109 |    9
 19200110 |   10
...
 19991221 | 29210
 19991222 | 29211
 19991223 | 29212
 19991224 | 29213
 19991225 | 29214
 19991226 | 29215
 19991227 | 29216
 19991228 | 29217
 19991229 | 29218
 19991230 | 29219
 19991231 | 29220
 20000101 | 29221

Code: Select all

dbadmin=> select min(date_key), max(date_key), min(rn), max(rn) from test_date;
   min    |   max    | min |  max  
----------+----------+-----+-------
 19200101 | 20000101 |   1 | 29221
(1 row)
I want to select a random date_key from the table. I tried this:

Code: Select all

dbadmin=> select date_key, rn from test_date where rn = (1 + randomint(29221));
 date_key |  rn   
----------+-------
 19790214 | 21595
(1 row)
It works most of the time, but sometimes I get more than one row:

Code: Select all

dbadmin=> select date_key, rn from test_date where rn = (1 + randomint(29221));
 date_key |  rn   
----------+-------
 19240302 |  1523
 19900613 | 25732
(2 rows)
And sometimes I get no rows:

Code: Select all

dbadmin=> select date_key, rn from test_date where rn = (1 + randomint(29221));
 date_key | rn 
----------+----
(0 rows)
How is that possible?

There are no duplicates:

Code: Select all

dbadmin=> select date_key, count(*) from test_date group by date_key having count(*) > 1;
 date_key | count 
----------+-------
(0 rows)

dbadmin=> select rn, count(*) from test_date group by rn having count(*) > 1;
 rn | count 
----+-------
(0 rows)
Can someone explain what is happening?

Thanks! :? :? :?

User avatar
JimKnicely
Site Admin
Site Admin
Posts: 1825
Joined: Sat Jan 21, 2012 4:58 am
Contact:

Re: Weird behavior with RANDOMINT

Post by JimKnicely » Tue Sep 24, 2013 2:38 pm

Hi Heather,

Try encapsulating the the RANDOMINT function into its own SELECT:

Like this:

Code: Select all

select date_key from test_date where rn = (select (1 + randomint(29221)));
Does that help?
Jim Knicely

Image

Note: I work for Vertica. My views, opinions, and thoughts expressed here do not represent those of my employer.

User avatar
heather
Newbie
Newbie
Posts: 22
Joined: Fri Dec 28, 2012 3:01 pm

Re: Weird behavior with RANDOMINT

Post by heather » Tue Sep 24, 2013 4:04 pm

Thanks, Jim! That seemed to work... but not when it is part of an update statement.

Code: Select all

dbadmin=> select * from rand_test;
 id | rn 
----+----
  1 |   
  2 |   
  3 |   
(3 rows)

dbadmin=> update rand_test set rn = randomint(10);
 OUTPUT 
--------
      3
(1 row)

dbadmin=> select * from rand_test;
 id | rn 
----+----
  1 |  5
  2 |  2
  3 |  1
(3 rows)
That update worked, but not this one:

Code: Select all

dbadmin=> update rand_test set rn = (select randomint(10));
 OUTPUT 
--------
      3
(1 row)

dbadmin=> select * from rand_test;
 id | rn 
----+----
  1 |  9
  2 |  9
  3 |  9
(3 rows)

dbadmin=> update rand_test set rn = (select randomint(10));
 OUTPUT 
--------
      3
(1 row)

dbadmin=> select * from rand_test;
 id | rn 
----+----
  1 |  8
  2 |  8
  3 |  8
(3 rows)
I get the same "random" number for every row?

id10t
GURU
GURU
Posts: 732
Joined: Mon Apr 16, 2012 2:44 pm

Re: Weird behavior with RANDOMINT

Post by id10t » Tue Sep 24, 2013 4:13 pm

Hi!

It's interesting.
I have only 2 ideas: table has mutable(not stable) default value in definitions or Vertica's reduce function work not properly.

1. Can you post a table definition? I will do a test on my system.
2. What is Vertica version?
3. Can you post PROFILE somewhere (pastebin for example)? I want to check query parallelism.

PS
If query parallelized so 2 answers probably comes from separate streams("reduce" didn't filter one answer). And if "reduce" work not properly so it can be reason for "No answer".

User avatar
JimKnicely
Site Admin
Site Admin
Posts: 1825
Joined: Sat Jan 21, 2012 4:58 am
Contact:

Re: Weird behavior with RANDOMINT

Post by JimKnicely » Tue Sep 24, 2013 4:14 pm

Heather,

That sub query will only get evaluated once (first) for every row you are updating. Try correlating it back to the table you are updating.

Like this:

Code: Select all

update rand_test a set rn = (select randomint(10) from rand_test b where b.id = a.id);
Does that work for you? That should cause the sub query to be evaluated for each row you are updating.
Jim Knicely

Image

Note: I work for Vertica. My views, opinions, and thoughts expressed here do not represent those of my employer.

User avatar
heather
Newbie
Newbie
Posts: 22
Joined: Fri Dec 28, 2012 3:01 pm

Re: Weird behavior with RANDOMINT

Post by heather » Tue Sep 24, 2013 4:19 pm

sKwa,

Here are the details:

Code: Select all

dbadmin=> select version();
              version               
------------------------------------
 Vertica Analytic Database v6.1.2-0
(1 row)

Code: Select all

dbadmin=> \dt rand_test;
                 List of tables
 Schema |   Name    | Kind  |  Owner  | Comment 
--------+-----------+-------+---------+---------
 public | rand_test | table | dbadmin | 
(1 row)

Code: Select all

dbadmin=> \d rand_test;
                                  List of Fields by Tables
 Schema |   Table   | Column | Type | Size | Default | Not Null | Primary Key | Foreign Key 
--------+-----------+--------+------+------+---------+----------+-------------+-------------
 public | rand_test | id     | int  |    8 |         | f        | f           | 
 public | rand_test | rn     | int  |    8 |         | f        | f           | 
(2 rows)

Code: Select all

dbadmin=> profile update rand_test a set rn = (select randomint(10));
NOTICE 4788:  Statement is being profiled
HINT:  Select * from v_monitor.execution_engine_profiles where transaction_id=45035996273714273 and statement_id=41;
NOTICE 3557:  Initiator memory for query: [on pool general: 293851 KB, minimum: 293851 KB]
NOTICE 5077:  Total memory required by query: [293851 KB]
 OUTPUT 
--------
      3
(1 row)

User avatar
heather
Newbie
Newbie
Posts: 22
Joined: Fri Dec 28, 2012 3:01 pm

Re: Weird behavior with RANDOMINT

Post by heather » Tue Sep 24, 2013 4:23 pm

Jim,

I just saw your reply. That did work! But it seems weird to me.

Code: Select all

dbadmin=> update rand_test a set rn = (select randomint(10) from rand_test b where b.id = a.id);
 OUTPUT 
--------
      3
(1 row)

dbadmin=> select * from rand_test;
 id | rn 
----+----
  1 |  0
  2 |  8
  3 |  9
(3 rows)
Hmm. Is there some sort of hint we can pass in the sub-query to say "please evaluate me for every row". I think that exists in Oracle.

Thanks for all your help guys. This has been driving me crazy for days!!!

Post Reply

Return to “New to Vertica Database Development”