If you are looking for some sample data to play around with check out my post here.
The following describes how I generally create sample data, in this case using the GO SALES database from IBM and getting the Order tables up to approximately 133 million records.
Here we are going to multiply the number of orders. Two tables are involved, ORDER_HEADER and ORDER_DETAIL and we want to maintain the relationship or ORDER_NUMBER between the two tables.
select count(*) as cnt From GOSALES.order_header; CNT ----- 53267 select count(*) as cnt From GOSALES.order_details; CNT ------ 446023
Create our target tables
create table gosales.test_order_details as (select * from GOSALES.order_details) with no data; create table gosales.test_order_header as (select * from GOSALES.order_header) with no data;
We will use the data that is already in these two tables, but ‘duplicate’ it with new order numbers, thus maintaining the relationship of the two tables. Additionally we will change the Order Dates.
First find the maximum order number
select max(order_number) as last_order_number from gosales.order_header; LAST_ORDER_NUMBER ----------------- 834936
For Order Numbers we will add 1000000 to each order number so they remain unique and thus conform to the tables Primary Key.
For Order Dates we will use todays date and subtract a random amount of time, and for order close data we will do the same, except add a random amount of time.
Our SQL looks like this.
insert into gosales.test_order_header ( select order_number + 001000000 , retailer_name , retailer_name_mb , retailer_site_code , retailer_contact_code , sales_staff_code , sales_branch_code , current_timestamp - ( RAND () *604800+ 1 ) seconds as order_date , current_timestamp + ( RAND () *604800 + 1 ) seconds as order_close_date , order_method_code from gosales.order_header );
We now have 53267 records in our test table.
A sample showing the new order_number and dates.
select order_number, retailer_name, order_date, order_close_date from test_order_header fetch first 1 rows only with UR; ORDER_NUMBER RETAILER_NAME ORDER_DATE ORDER_CLOSE_DATE ------------ --------------- ------------------- ------------------- 1100001 Kavanagh Sports 2015-09-30 14:19:08 2015-10-04 13:12:43
Repeat for as many as you like while incrementing the Order_number augmentation
insert into gosales.test_order_header ( select order_number + 002000000, retailer_name , insert into gosales.test_order_header ( select order_number + 003000000, retailer_name , .... insert into gosales.test_order_header ( select order_number + 300000000, retailer_name ,
Here I repeated the above 300 times going up to order_number + 300000000 and have 15,980,100 Orders records in my test order header table.
Now is a good time to run RUNSTATS and create an index on your new test order header table.
C:\Program Files\IBM\SQLLIB\BIN>db2 connect to gsdb user db2admin Enter current password for db2admin: Database Connection Information Database server = DB2/NT64 10.5.1 SQL authorization ID = DB2ADMIN Local database alias = GSDB C:\Program Files\IBM\SQLLIB\BIN>db2 runstats on table gosales.test_order_header DB20000I The RUNSTATS command completed successfully. C:\Program Files\IBM\SQLLIB\BIN>db2 create index i01 on gosales.test_order_header (order_number) DB20000I The SQL command completed successfully.
To get our Order Detail records join the existing Order detail table to our new test order header table. Here adding a random part-of-day to make up a ship date
insert into gosales.test_order_details ( select a.order_detail_code , b.order_number , b.order_close_date + ( RAND () *86400 + 1 ) seconds , a.product_number , a.promotion_code , a.quantity , a.unit_cost , a.unit_price , a.unit_sale_price from gosales.order_details a inner join test_order_header b on b.order_number = a.order_number + 001000000 ) ;
Repeat this to match the ORDER_HEADER table and you end up with some reasonably sized data to play with.
Here I’ve repeated it 300 times to match the 300x I used to generate data for the order header giving me the following
select count(*) as cnt from test_order_header; CNT -------- 15980100 select count(*) as cnt from gosales.test_order_details CNT --------- 133806900
Run runstats and add an index to this one as well
db2 runstats on table gosales.test_order_details db2 create index i02 on gosales.test_order_details (order_number)
For some added volume you may want to look at creating a denormalized table combining the two tables plus other tables from the GoSales Database.
create view gosales.order_denorm_v as ( select h.order_number ,h.retailer_name ,h.retailer_name_mb ,h.retailer_site_code ,h.retailer_contact_code ,h.sales_staff_code ,h.sales_branch_code ,b.city as sales_city ,b.prov_state as sales_prov_state ,b.postal_zone as sales_postal_zone ,b.country_code as sales_country_code ,c.country_en as sales_country ,h.order_date ,h.order_close_date ,h.order_method_code ,o.order_method_en ,d.order_detail_code ,d.ship_date ,d.product_number ,p.product_image ,d.promotion_code ,d.quantity ,d.unit_cost ,d.unit_price ,d.unit_sale_price from gosales.test_order_header h inner join gosales.test_order_details d on h.order_number= d.order_number inner join gosales.order_method o on o.order_method_code = h.order_method_code inner join gosales.product p on p.product_number = d.product_number inner join gosales.branch b on b.branch_code = h.sales_branch_code inner join gosales.country c on b.country_code = c.country_code ) ; create table gosales.order_denorm as ( select * from gosales.order_denorm_v ) with no data ; insert into gosales.order_denorm (select * from gosales.order_denorm_v) ;