Sunday, September 9, 2007

CSV File Creator Procedure

There is often a need to create comma separated values (CSV) files from an Oracle query, whether it be a whole table or just a subset. Many database utilities like Toad or Oracle SQL Developer offer such functionality. In fact, our application has this built in using Java. But what if you want to create a file using sqlplus? Some of our support staff asked for something like this so I put something together for them and thought I would share. I borrowed generously from Tom Kyte's PRINT_TABLE procedure, but just added some additional parameters to save to a delimited file. To make this work, you will need a directory that can be accessed with UTL_FILE. One that you've defined in your utl_file_dir system parameter or an Oracle directory you've created. The procedure defaults to a directory defined as CSV_DIR.

CREATE DIRECTORY CSV_DIR AS 'C:\TEMP';

So, if you are using an Oracle directory use CSV_DIR (or whatever you've defined for your directory name) or a directory you've defined in utl_file_dir (i.e. /tmp).

Here is the code:


create or replace
procedure create_csv
( p_query in varchar2,
p_file_name in varchar2,
p_dir_name in varchar2 default 'CSV_DIR',
p_delimiter in varchar2 default ',',
p_date_fmt in varchar2 default 'dd-mon-yyyy hh24:mi:ss' )

-- this utility is designed to be installed ONCE in a database and used
-- by all. Also, it is nice to have roles enabled so that queries by
-- DBA's that use a role to gain access to the DBA_* views still work
-- that is the purpose of AUTHID CURRENT_USER
AUTHID CURRENT_USER
is
l_theCursor integer default dbms_sql.open_cursor;
l_columnValue varchar2(4000);
l_status integer;
l_descTbl dbms_sql.desc_tab;
l_colCnt number;
l_cs varchar2(255);
l_date_fmt varchar2(255);
l_line_cols long;
l_line_vals long;
l_file UTL_FILE.File_Type;


-- small inline procedure to restore the sessions state
-- we may have modified the cursor sharing and nls date format
-- session variables, this just restores them
procedure restore
is
begin
if ( upper(l_cs) not in ( 'FORCE','SIMILAR' ))
then
execute immediate
'alter session set cursor_sharing=exact';
end if;
if ( p_date_fmt is not null )
then
execute immediate
'alter session set nls_date_format=''' || l_date_fmt || '''';
end if;
dbms_sql.close_cursor(l_theCursor);
end restore;
begin
-- I like to see the dates print out with times, by default, the
-- format mask I use includes that. In order to be "friendly"
-- we save the date current sessions date format and then use
-- the one with the date and time. Passing in NULL will cause
-- this routine just to use the current date format
if ( p_date_fmt is not null )
then
select sys_context( 'userenv', 'nls_date_format' )
into l_date_fmt
from dual;


execute immediate
'alter session set nls_date_format=''' || p_date_fmt || '''';
end if;


-- to be bind variable friendly on this ad-hoc queries, we
-- look to see if cursor sharing is already set to FORCE or
-- similar, if not, set it so when we parse -- literals
-- are replaced with binds
if ( dbms_utility.get_parameter_value
( 'cursor_sharing', l_status, l_cs ) = 1 )
then
if ( upper(l_cs) not in ('FORCE','SIMILAR'))
then
execute immediate
'alter session set cursor_sharing=force';
end if;
end if;


-- parse and describe the query sent to us. we need
-- to know the number of columns and their names.
dbms_sql.parse( l_theCursor, p_query, dbms_sql.native );
dbms_sql.describe_columns
( l_theCursor, l_colCnt, l_descTbl );


-- define all columns to be cast to varchar2's, we
-- are just printing them out
for i in 1 .. l_colCnt loop
if ( l_descTbl(i).col_type not in ( 113 ) )
then
dbms_sql.define_column
(l_theCursor, i, l_columnValue, 4000);
-- seperate column names by a comma
if(i =1) then
l_line_cols:=l_descTbl(i).col_name;
else
l_line_cols:=l_line_cols||p_delimiter||l_descTbl(i).col_name;
end if;
end if;
end loop;

-- execute the query, so we can fetch
l_status := dbms_sql.execute(l_theCursor);


--open file
l_file := UTL_FILE.FOPEN(p_dir_name,p_file_name,'w');

-- print column headers
UTL_FILE.PUT_LINE(l_file,l_line_cols);

while ( dbms_sql.fetch_rows(l_theCursor) > 0 )
loop
l_line_vals:='';
for i in 1 .. l_colCnt
loop
if ( l_descTbl(i).col_type not in ( 113 ) )
then
dbms_sql.column_value
( l_theCursor, i, l_columnValue );
-- seperate column values by a comma
if(i =1) then
l_line_vals:=l_columnValue;
else
l_line_vals:=l_line_vals||p_delimiter||l_columnValue;
end if;
end if;
end loop;
--print line values
UTL_FILE.PUT_LINE(l_file,l_line_vals);
end loop;

-- close file
UTL_FILE.FCLOSE(l_file);

-- now, restore the session state, no matter what
restore;
exception
when others then
restore;
UTL_FILE.FCLOSE(l_file);
raise;
end;
/

As with PRINT_TABLE, I'm avoiding BLOB columns. So, if they are included in the query, they will be ignored.

Here it is in action:

SQL> exec create_csv('select * from emp','emp.csv','CSV_DIR');

SQL> host type C:\temp\emp.csv

EMPNO,ENAME,JOB,MGR,HIREDATE,SAL,COMM,DEPTNO
7369,SMITH,CLERK,7902,17-dec-1980 00:00:00,800,,20
7499,ALLEN,SALESMAN,7698,20-feb-1981 00:00:00,1600,300,30
7521,WARD,SALESMAN,7698,22-feb-1981 00:00:00,1250,500,30
7566,JONES,MANAGER,7839,02-apr-1981 00:00:00,2975,,20
7654,MARTIN,SALESMAN,7698,28-sep-1981 00:00:00,1250,1400,30
7698,BLAKE,MANAGER,7839,01-may-1981 00:00:00,2850,,30
7782,CLARK,MANAGER,7839,09-jun-1981 00:00:00,2450,,10
7788,SCOTT,ANALYST,7566,19-apr-1987 00:00:00,3000,,20
7839,KING,PRESIDENT,,17-nov-1981 00:00:00,5000,,10
7844,TURNER,SALESMAN,7698,08-sep-1981 00:00:00,1500,0,30
7876,ADAMS,CLERK,7788,23-may-1987 00:00:00,1100,,20
7900,JAMES,CLERK,7698,03-dec-1981 00:00:00,950,,30
7902,FORD,ANALYST,7566,03-dec-1981 00:00:00,3000,,20
7934,MILLER,CLERK,7782,23-jan-1982 00:00:00,1300,,10

Simple as that.

No comments: