Selecting a Random Set of Rows from a Table
Moderator: NorbertKrupa
Selecting a Random Set of Rows from a Table
Hi guys,
Is there a way that I can select a random set of rows from a table? For instance, if I have a million rows in a table I want to create another table that contains a random subset of 100,000 rows from the original table. I want it to be random so that I can keep repopulating the smaller table when I want a new set of data.
Thank you!
Is there a way that I can select a random set of rows from a table? For instance, if I have a million rows in a table I want to create another table that contains a random subset of 100,000 rows from the original table. I want it to be random so that I can keep repopulating the smaller table when I want a new set of data.
Thank you!
Thanks,
Juliette
Juliette
Re: Selecting a Random Set of Rows from a Table
Hi Julie!
I can't promise exact amount of rows in random sample, but some banal solution I can give for your problem:
I can't promise exact amount of rows in random sample, but some banal solution I can give for your problem:
- Table:
Code: Select all
CREATE TABLE public.RandomSample ( id IDENTITY , rand float );
- Count rows:
Code: Select all
dbadmin=> select count(*) from RandomSample ; count ---------- 10000000 (1 row)
- Query
Code: Select all
dbadmin=> select * from RandomSample where RANDOMINT(2) limit 25; id | rand ----+----------------- 1 | 0.435220889408 2 | 0.526619048598 7 | 0.503683776469 8 | 0.156843165799 9 | 0.287958949609 11 | 0.729640057579 12 | 0.995287269836 13 | 0.903783922664 14 | 0.0783629890101 15 | 0.842216552691 21 | 0.182495597212 25 | 0.186746456018 26 | 0.143420991338 29 | 0.349558490354 30 | 0.741160671611 33 | 0.52479647246 34 | 0.384557384151 35 | 0.460069846527 37 | 0.787678300055 40 | 0.615458029295 41 | 0.648441913875 45 | 0.866885151712 47 | 0.808360821028 48 | 0.148068713781 49 | 0.426903519349 (25 rows)
- Filter: where RANDOMINT(2) equals to where RANDOMINT(2) = 1 or where RANDOMINT(2)::BOOLEAN = TRUE
- Sample: LIMIT 25 promise to us that output will be equals or less than this number
Code: Select all
dbadmin=> select * from RandomSample where RANDOMINT(2) and (RandomSample.id) % 3 < 1 limit 25;
id | rand
-----+-----------------
3 | 0.806599489745
6 | 0.861293215119
9 | 0.287958949609
21 | 0.182495597212
27 | 0.991428367272
33 | 0.52479647246
36 | 0.306133418106
39 | 0.0674794158243
45 | 0.866885151712
48 | 0.148068713781
51 | 0.648667960266
60 | 0.444064909126
66 | 0.947957454543
69 | 0.444615772056
75 | 0.510743846082
81 | 0.836552352283
87 | 0.624822871063
90 | 0.95282113079
99 | 0.0967488669246
102 | 0.710242643209
105 | 0.893509417363
111 | 0.0513069083741
114 | 0.363351906321
117 | 0.635888079515
120 | 0.58747541997
(25 rows)
-
- Beginner
- Posts: 36
- Joined: Fri Feb 17, 2012 6:09 pm
- Contact:
Re: Selecting a Random Set of Rows from a Table
Try this:
Code: Select all
select A.*
from (
select random() as RandomNumber, *
from <TableNameGoesHere> ) as A
order by RandomNumber
limit 100000;
Re: Selecting a Random Set of Rows from a Table
Just4Fun
(@doug_harmon gave good solution)
Over 1mil rows function randomint(1000000) % 10 = 7 gives better random distribution than sort by random() and returns about 100K rows.
(@doug_harmon gave good solution)
Over 1mil rows function randomint(1000000) % 10 = 7 gives better random distribution than sort by random() and returns about 100K rows.
Code: Select all
dbadmin=> \d floats
List of Fields by Tables
Schema | Table | Column | Type | Size | Default | Not Null | Primary Key | Foreign Key
--------+--------+--------+-------+------+---------+----------+-------------+-------------
public | floats | id | int | 8 | | t | f |
public | floats | f | float | 8 | | f | f |
(2 rows)
Code: Select all
dbadmin=> select count(*) from floats;
count
---------
1000000
(1 row)
Code: Select all
dbadmin=> select count(*) from floats where randomint((select count(*) from floats))%10 = 7;
count
--------
100178
(1 row)
Code: Select all
dbadmin=> select count(*) from floats where randomint((select count(*) from floats))%10 = 7;
count
--------
100117
(1 row)
Code: Select all
dbadmin=> select count(*) from floats where randomint((select count(*) from floats))%10 = 7;
count
--------
100136
(1 row)
Re: Selecting a Random Set of Rows from a Table
sKwa, thanks for your response! Although, maybe I didn't make myself clear.
Say I have a table TABLE1 with these values:
For instance:
I hope this makes more sense. Again, thanks for your help!
Say I have a table TABLE1 with these values:
- Red
Blue
Green
Black
Purple
Orange
For instance:
- Green
Purple
Orange
- Blue
Black
Orange
I hope this makes more sense. Again, thanks for your help!
Thanks,
Juliette
Juliette
Re: Selecting a Random Set of Rows from a Table
Hi Julie!
I will try:
I will try:
- Table:
Code: Select all
test_db=> create table Colors (colour varchar(10)); CREATE TABLE
- Data:
Code: Select all
test_db=> copy Colors from stdin direct; Enter data to be copied followed by a newline. End with a backslash and a period on a line by itself. >> Green >> Blue >> Black >> Orange >> Magenta >> Yellow >> #FFCCBB >> White >> #DCDCDC >> \.
- Table of samples:
Code: Select all
create table if not exists ColorsRand as select colour from Colors where RANDOMINT(2); CREATE TABLE
- Show result:
Code: Select all
test_db=> select * from Colors left outer join ColorsRand on Colors.colour = ColorsRand.colour; colour | colour ---------+--------- #DCDCDC | #FFCCBB | #FFCCBB Black | Black Blue | Blue Green | Magenta | Orange | White | Yellow | (9 rows)
- One more time:
Code: Select all
truncate table ColorsRand; insert into ColorsRand select colour from Colors where RANDOMINT(2); select * from Colors left outer join ColorsRand on Colors.colour = ColorsRand.colour;
Code: Select all
TRUNCATE TABLE OUTPUT -------- 5 (1 row) colour | colour ---------+--------- #DCDCDC | #FFCCBB | #FFCCBB Black | Black Blue | Blue Green | Green Magenta | Magenta Orange | White | Yellow | (9 rows)
Last edited by id10t on Wed Jun 13, 2012 10:37 pm, edited 2 times in total.
Re: Selecting a Random Set of Rows from a Table
SELECT INTO
But in second time it will not work:
So the same way: TRUNCATE -> INSERT ... RANDOM FILTER
Code: Select all
test_db=> drop table ColorsRand ;
DROP TABLE
test_db=> select * into ColorsRand from Colors where RANDOMINT(2);
CREATE TABLE
test_db=> select * from Colors left outer join ColorsRand on Colors.colour = ColorsRand.colour;
colour | colour
---------+---------
#DCDCDC |
#FFCCBB |
Black |
Blue | Blue
Green |
Magenta | Magenta
Orange | Orange
White | White
Yellow | Yellow
(9 rows)
Code: Select all
test_db=> select * into ColorsRand from Colors where RANDOMINT(2);
ROLLBACK 4213: Object "ColorsRand" already exists