Databases and procedural programming - part 1
Tuesday, September 29, 2009
dba : the code is just an UDB translation of the 'whatever' script written in some 'whatever' interpreted language programmer : well, that was my requirement, was asked to convert this into an db2 SP, dba : whats the cardinality of this cursor ? p : 72 million d : (ok, today I did wake up in front of the mirror) and you think this is good? p : that was the work set in the file for that script d : do you know how that handled the file and memory? and do you know this is going to sort all 72 million rows? p : so what? DB2 has been touted as best thing next to bread, should be a breeze, now you help me speed this up, is this locked? d : (can't find a gun) hands out the 'thinking in sets' by joe celko can you please read this when free? (should contact Colbert, well there ain't colberts in the nerd domain ) p : ok, but promise you'll speed this procedure d worked a week and came up with a procedure that worked in sets and wanted to advise this to the procedural DB programmer (along with the link to the famous 'kiss my royal irish a**' scene from 25th hour) 1. Databases are logical mappings of data, not physical, I've even been asked to run an update over one page to the next in the table, and table being called a file 2. projections and other db concepts rose out of set theory, so please think in sets, 3. break your work into finer pieces, the database and tables are not at your disposal Once I was called to inspect a 'database slowness' for an SP. Below is a snippet. declare cur1 cursor for select create_dt, client_cd,order_line_num from tab1 order by client_cd, create_dt ; fetch cur1 into v_create_dt,v_client_cd,v_order_line_num; if (completed(v_order_line_num,v_client_cd) = 1 ) then do the processing for the client and for the particular line number; . . end if; set old_v_client_cd = v_client_cd; set old_v_order_line_num = v_order_line_num; while((fetch cur1 into v_client_cd,v_order_line_num) = (old_v_client_cd,old_v_order_line_num)) do end while; This sql was ordering 72 million rows and picking up the max date for a particular order line . Obviously the programmer wasn't updating his knowledge on the presence of the 'partition by' clause, so I re-wrote update target_table1 set full_amt = (select sum(amt) from table1 t1 where row-number() over(partition by client_cd, order_line_num order by create_dt desc) = 1) Simple and powerful single query to tackle a whole pseudocode. That is the power of set processing. Viola! the sp completed in 6 minutes compared to 4 hours and stopping the rest of the processes meanwhile. I could go on and on about the 'legacy' data programmers, and 'architects'. I am not gifted to working with people like james koopman or Cunningham to have a great insight or implement architectures for the programs. But when I face these kinds of crap from programmers and team leads of over 10 years experience, I question whether it is their complacency or lack of the 'push' for improving their knowledge in the domain they work. Regards to DB2, this has come a long way from a few blogs/a great info center to a thousand webpages, groups, conferences etc.