SQLite Extraction of Oracle Tables Tools, Methods and Pitfalls
Introduction
The SQLite database is a wildly successful and ubiquitous software package that is mostly unknown to the larger IT community. Designed and coded by Dr. Richard Hipp, the third major revision of SQLite serves many users in market segments with critical requirements for software quality, which SQLite has met with compliance to the DO-178B avionics standard. In addition to a strong presence in aerospace and automotive, most major operating system vendors (including Oracle, Microsoft, Apple, Google, and RedHat) include SQLite as a core OS component.
There are a few eccentricities that may trip up users from other RDBMS environments. SQLite is known as a “flexibly-typed” database, unlike Oracle, which rigidly enforces columnar datatypes; character values can be inserted into SQLite columns that are declared integer without error (although check constraints can strengthen SQLite type rigidity, if desired). While many concurrent processes are allowed to read from a SQLite database, only one process is allowed write privilege at any time (applications requiring concurrent writers should tread carefully with SQLite). There is no network interface, and all connections are made through a filesystem; SQLite does not implement a client-server model. There is no “point in time recovery,” and backup operations are basically an Oracle 7-style ALTER DATAFILE BEGIN BACKUP that makes a transaction-consistent copy of the whole database. GRANT and REVOKE are not implemented in SQLite, which uses filesystem permissions for all access control. There are no background processes, and newly-connecting clients may find themselves delayed and responsible for transaction recovery, statistics collection, or other administrative functions that are quietly performed in the background in this “zero-administration database.” Some history and architecture of SQLite can be found in audio and video records of Dr. Hipp's discussions.
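The flexible typing described above is easy to demonstrate. Below is a minimal sketch using Python's built-in sqlite3 module (not part of the toolchain discussed in this article): a text value is accepted into an integer column, and a CHECK constraint restores rigidity where it is wanted.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t(n INTEGER)")
# Flexible typing: a non-numeric text value is accepted in an INTEGER column.
con.execute("INSERT INTO t VALUES ('not a number')")
print(con.execute("SELECT n, typeof(n) FROM t").fetchone())
# ('not a number', 'text')

# A CHECK constraint can strengthen type rigidity where desired.
con.execute("CREATE TABLE s(n INTEGER CHECK (typeof(n) = 'integer'))")
try:
    con.execute("INSERT INTO s VALUES ('oops')")
except sqlite3.IntegrityError:
    print("rejected by CHECK constraint")
```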
Despite these eccentricities, SQLite is likely a superior format for data exchange as opposed to CSV, XML, or even JSON, as indexes can be included, enabling recipients to perform high-speed queries in SQL92 without any preprocessing, licensing, or activation. SQLite’s conservative coding style and commentary is intended to benefit “future programmers who are not yet born,” and the on-disk database format has further been defined as a long-term storage standard by the Library of the U.S. Congress.
Three methods will be presented to copy Oracle database tables into SQLite: CSV exchange, ODBC database links, and PHP scripting. The CSV exchange can be done with existing tooling and a POSIX shell, without any administrative intervention beyond an accessible account with target data. The unixODBC method requires an administrator to compile, install and configure SQLite ODBC data sources, then attach to them with Oracle's dg4odbc, the Oracle server ODBC interface. The PHP scripting approach allows LOB types to be moved, but a compiler and development environment may be necessary to prepare it.
Extraction of Oracle database content into SQLite is somewhat problematic, as the basic data types do not quite align. Here I will present tools to coerce alignment, and extract with proper transformation.
It is assumed that the user has access to an Oracle database and Oracle's sqlplus command-line tool, and knowledge of their use (SQLcl can also be used).
SQLite safety
SQLite is focused on developers who compile the database source code into their own applications, and the system-level tools appear to suffer from some neglect. The command-line utility for administering databases, sqlite3, is offered for Linux in the tools collection. The latest releases present compelling new features, but there are sufficient problems with the compiled binary that I advise against its use.
- The utility is compiled only for 32-bit Linux x86 environments and relies on shared libraries, which presents two problems:
  - Many modern AMD64/x86_64 Linux distributions do not include 32-bit libc or other dependent libraries by default, requiring administrative access to install.
  - 32-bit Linux allows a program to use a maximum of 3 gigabytes of memory, which may unreasonably restrict sorts and caching when far more physical RAM is available that could be used in 64-bit mode.
- The sqlite3 distributed in the tools collection does not have any compiler safety features enabled, as reported by the hardening-check utility that can be found in EPEL.
These are the hardening check results for the Linux tools utility distributed by SQLite:
$ hardening-check sqlite3; file sqlite3
sqlite3:
 Position Independent Executable: no, normal executable!
 Stack protected: no, not found!
 Fortify Source functions: no, only unprotected functions found!
 Read-only relocations: no, not found!
 Immediate binding: no, not found!
sqlite3: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 4.3.0, stripped
I suggest that the source download (currently sqlite-autoconf-3360000.tar.gz) be obtained, and prepared on modern operating systems with current compiler safety controls:
CFLAGS='-O3 -D_FORTIFY_SOURCE=2 -fstack-protector-strong -fpic -pie' \
LDFLAGS='-Wl,-z,relro,-z,now -Wl,-z,now' ./configure
Assuming this is done, all of the compiler security controls will be enabled:
$ hardening-check sqlite3; file sqlite3
sqlite3:
 Position Independent Executable: yes
 Stack protected: yes
 Fortify Source functions: yes (some protected functions found)
 Read-only relocations: yes
 Immediate binding: yes
sqlite3: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.32, BuildID[sha1]=34ffa395a5f985b16f6ac2e6c7c3ad5126e05bed, not stripped
SQLite is infrastructure, and though we hope it to be invulnerable, it should adhere to modern compiler safety standards. Any build that can avail itself of these options should.
DDL
SQLite allows the following basic data affinities to be defined for the columns of a table. The values stored in those columns can be NULL or any of these types, excepting NUMERIC:
- TEXT - text string, stored using the database encoding (UTF-8, UTF-16BE or UTF-16LE).
- BLOB - blob of data, stored exactly as it was input.
- NUMERIC - generic number, attempts to devolve to integer or real.
- INTEGER - signed, stored in 1, 2, 3, 4, 6, or 8 bytes depending on the magnitude of the value.
- REAL - floating point value, stored as an 8-byte IEEE-754 format.
These types are not a perfect fit for Oracle.
The Oracle NUMBER type is used to hold both integers and real numbers, depending upon the use of the “data scale.” Oracle's CHAR and VARCHAR2 map to the TEXT type. The DATE type is not directly supported in SQLite, and we treat it as an INTEGER for now.
Upon examination of the source code, the SQLite NUMERIC affinity turns out to be a default rather than a recognized keyword: only the first few characters of a column's declared datatype are checked, and if no known substring is found, the column is assigned the numeric affinity. To demonstrate:
$ sqlite3
SQLite version 3.36.0 2021-06-18 18:36:39
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> CREATE TABLE foo(bar razzamataz,ch text,fl real,whl int,mess blob);
sqlite> .dump foo
PRAGMA foreign_keys=OFF;
BEGIN TRANSACTION;
CREATE TABLE foo(bar razzamataz,ch text,fl real,whl int,mess blob);
COMMIT;
sqlite> create table bar as select * from foo where 1=0;
sqlite> .dump bar
PRAGMA foreign_keys=OFF;
BEGIN TRANSACTION;
CREATE TABLE bar(
  bar NUM,
  ch TEXT,
  fl REAL,
  whl INT,
  mess
);
COMMIT;
The CREATE TABLE AS SELECT (CTAS) operation above calls the C function createTableStmt, which extracts column affinities. The unknown type “razzamataz” was silently assigned numeric, and the blob type was omitted completely. The NUM keyword in the bar table is also not recognized. This is the most direct method that I have found in SQLite to determine a column's real data affinities.
The actual SQLite column affinities that are defined in the source code are:
**    Substring | Affinity
** --------------------------------
**     'INT'    | SQLITE_AFF_INTEGER
**     'CHAR'   | SQLITE_AFF_TEXT
**     'CLOB'   | SQLITE_AFF_TEXT
**     'TEXT'   | SQLITE_AFF_TEXT
**     'BLOB'   | SQLITE_AFF_BLOB
**     'REAL'   | SQLITE_AFF_REAL
**     'FLOA'   | SQLITE_AFF_REAL
**     'DOUB'   | SQLITE_AFF_REAL
**
** If none of the substrings in the above table are found,
** SQLITE_AFF_NUMERIC is returned.
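The substring matching has a well-known consequence: since 'INT' is checked before the REAL substrings, a column declared FLOATING POINT receives INTEGER affinity, not REAL. A short sketch in Python's built-in sqlite3 module (an illustration of my own, not from the extraction toolchain) makes the point:

```python
import sqlite3

con = sqlite3.connect(":memory:")
# 'FLOATING POINT' contains the substring 'INT', which is matched first,
# so this column is assigned INTEGER affinity rather than REAL.
con.execute("CREATE TABLE a(x FLOATING POINT)")
# INTEGER affinity converts a losslessly-representable real to an integer.
con.execute("INSERT INTO a VALUES (?)", (3.0,))
print(con.execute("SELECT x, typeof(x) FROM a").fetchone())
# (3, 'integer')
```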
Oracle's binary types (LONG, BLOB, RAW, etc.) can only be moved with a more powerful tool, such as PHP, discussed later in this document. Any other exotic Oracle types are left as an exercise for the reader.
Below is a table DDL (Data Definition Language) extractor script for Oracle sqlplus that can be used to create compatible tables in SQLite:
$ cat textract_ddl.sql
set pages 50000 lin 32767 verify off heading off feedback off newpage none

variable own varchar2(128);
variable nam varchar2(128);

execute :own := upper('&1');
execute :nam := upper('&2');

select null, 'create table ' || :nam || ' (' from dual;

select sep, substr(col,1,cl) col, dtype, num,
       case when lag(num) over (order by num) = num - 1
            then null else '*' end chk
from (
  select case when COLUMN_ID = 1 then null else ',' end sep,
         COLUMN_NAME col, 'TEXT --' dtype, COLUMN_ID num,
         (select max(length(COLUMN_NAME)) from ALL_TAB_COLUMNS
           where OWNER = :own and TABLE_NAME = :nam) cl
    from ALL_TAB_COLUMNS
   where OWNER = :own and TABLE_NAME = :nam
     and DATA_TYPE IN ('CHAR', 'VARCHAR', 'VARCHAR2') --'CLOB', 'NCLOB' ...
  union all
  select case when COLUMN_ID = 1 then null else ',' end sep,
         COLUMN_NAME col,
         case when data_scale = 0 then 'INT --'
              when data_scale <> 0 then 'REAL --'
              else 'NUM --' end dtype,
         COLUMN_ID num,
         (select max(length(COLUMN_NAME)) from ALL_TAB_COLUMNS
           where OWNER = :own and TABLE_NAME = :nam) cl
    from ALL_TAB_COLUMNS
   where OWNER = :own and TABLE_NAME = :nam and DATA_TYPE = 'NUMBER'
  union all
  select case when COLUMN_ID = 1 then null else ',' end sep,
         COLUMN_NAME col, 'INT --date' dtype, COLUMN_ID num,
         (select max(length(COLUMN_NAME)) from ALL_TAB_COLUMNS
           where OWNER = :own and TABLE_NAME = :nam) cl
    from ALL_TAB_COLUMNS
   where OWNER = :own and TABLE_NAME = :nam and DATA_TYPE = 'DATE'
  order by num
)
union all
select null, ');', null, null, null from dual;
To demonstrate a failure condition when using this script, we can run it against the ALL_VIEWS view, which can usually be seen by any connected user. This view contains a LONG column that will require a more powerful client, such as PHP. Attempting to run the script in either sqlplus or SQLcl results in a flagged (and skipped) column:
SQL> start textract_ddl sys all_views
create table all_views (
 OWNER TEXT -- 1 *
, VIEW_NAME TEXT -- 2
, TEXT_LENGTH NUM -- 3
, TYPE_TEXT_LENGTH NUM -- 5 *
, TYPE_TEXT TEXT -- 6
, OID_TEXT_LENGTH NUM -- 7
, OID_TEXT TEXT -- 8
, VIEW_TYPE_OWNER TEXT -- 9
, VIEW_TYPE TEXT -- 10
, SUPERVIEW_NAME TEXT -- 11
, EDITIONING_VIEW TEXT -- 12
, READ_ONLY TEXT -- 13
);
The flagged columns above, numbers 1 and 5, indicate a failure of the LAG window function on that row. Column 1 will always be flagged in any run, but any other flags indicate an unhandled datatype (in this case, a LONG in column 4).
A more subtle challenge is the lack of a specified data_scale on integer values. A run against the ALL_OBJECTS view below exhibits the problem:
SQL> start textract_ddl sys all_objects
create table all_objects (
 OWNER TEXT -- 1 *
, OBJECT_NAME TEXT -- 2
, SUBOBJECT_NAME TEXT -- 3
, OBJECT_ID NUM -- 4
, DATA_OBJECT_ID NUM -- 5
, OBJECT_TYPE TEXT -- 6
, CREATED INT --date 7
, LAST_DDL_TIME INT --date 8
, TIMESTAMP TEXT -- 9
, STATUS TEXT -- 10
, TEMPORARY TEXT -- 11
, GENERATED TEXT -- 12
, SECONDARY TEXT -- 13
);
The two NUM columns above actually contain integers, not floating point numbers. This problem usually appears when a column is defined as a NUMBER in Oracle without a precision and scale. It may be useful to see a few examples of the various Oracle numeric types:
SQL> create table testab(n1 number(5,3), n2 number (5,0), n3 number(*,0), n4 number(5), n5 number);

Table created.

SQL> select column_name, data_type, data_precision, data_scale from user_tab_columns where table_name='TESTAB';

COLUMN_NAME DATA_TYPE DATA_PRECISION DATA_SCALE
----------- --------- -------------- ----------
N1          NUMBER                 5          3
N2          NUMBER                 5          0
N3          NUMBER                            0
N4          NUMBER                 5          0
N5          NUMBER

SQL> insert into testab values(3.141592653589, 2.718281828, 1.414, 22/7, (1+sqrt(5))/2);

1 row created.

SQL> select * from testab;

        N1         N2         N3         N4         N5
---------- ---------- ---------- ---------- ----------
     3.142          3          1          3 1.61803399

SQL> start textract_ddl fishecj testab
create table testab (
 N1 REAL -- 1 *
, N2 INT -- 2
, N3 INT -- 3
, N4 INT -- 4
, N5 NUM -- 5
);
If you are sure that an Oracle number column contains integers, even though the scale is not specified, change the column type manually.
We can use the SQLite command line shell to create a database and add the ALL_OBJECTS table as-is, including the NUM columns (note that the shell tool does not return confirmation messages):
$ sqlite3 orasys.db3
SQLite version 3.36.0 2021-06-18 18:36:39
Enter ".help" for usage hints.
sqlite> create table all_objects (
   ...>  OWNER TEXT -- 1 *
   ...> , OBJECT_NAME TEXT -- 2
   ...> , SUBOBJECT_NAME TEXT -- 3
   ...> , OBJECT_ID NUM -- 4
   ...> , DATA_OBJECT_ID NUM -- 5
   ...> , OBJECT_TYPE TEXT -- 6
   ...> , CREATED INT --date 7
   ...> , LAST_DDL_TIME INT --date 8
   ...> , TIMESTAMP TEXT -- 9
   ...> , STATUS TEXT -- 10
   ...> , TEMPORARY TEXT -- 11
   ...> , GENERATED TEXT -- 12
   ...> , SECONDARY TEXT -- 13
   ...> );
For very large databases, it is helpful to maximize the SQLite page size. This will slightly reduce storage overhead, and result in a smaller file size for otherwise identical tables:
sqlite> pragma page_size=65536;
sqlite> vacuum;
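The page size can also be set before any tables are created, in which case the VACUUM is unnecessary. A minimal sketch with Python's built-in sqlite3 module (my illustration, not part of the article's tooling):

```python
import sqlite3

con = sqlite3.connect(":memory:")
# page_size takes effect when the database is first written
# (or when an existing database is rebuilt with VACUUM).
con.execute("PRAGMA page_size=65536")
con.execute("CREATE TABLE t(x)")
print(con.execute("PRAGMA page_size").fetchone()[0])  # 65536
```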
SQLite has also implemented Write Ahead Logging, which has a number of benefits but also introduces limitations. If WAL mode is desired, enable it with the PRAGMA journal_mode=WAL; command.
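The pragma reports the mode that was actually set, which makes it easy to verify. A quick sketch, again in Python's built-in sqlite3 module (WAL is a property of the database file, so a file-backed database is used here; the path is mine):

```python
import os
import sqlite3
import tempfile

# WAL mode persists in the database file, so this must be a file, not :memory:.
path = os.path.join(tempfile.mkdtemp(), "wal.db3")
con = sqlite3.connect(path)
print(con.execute("PRAGMA journal_mode=WAL").fetchone()[0])  # wal
```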
At this point, we assume that our SQLite tables are now defined within our database file, and are ready to receive extracted data from Oracle.
sqlite> .quit
DML
The principal problem in this section is the expression of dates and times.
As mentioned in previous discussion of data affinities, there are no native date or time formats in SQLite. Conversion functions exist for dates expressed in the following SQLite datatypes:
- TEXT as ISO8601 strings ("YYYY-MM-DD HH:MM:SS.SSS").
- REAL as Julian day numbers, the number of days since noon in Greenwich on November 24, 4714 B.C. according to the proleptic Gregorian calendar.
- INTEGER as UNIX Epoch Time, the number of seconds since 1970-01-01 00:00:00 UTC.
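The three representations above can be exercised directly with SQLite's date and time functions; a short demonstration through Python's built-in sqlite3 module (my illustration):

```python
import sqlite3

con = sqlite3.connect(":memory:")
row = con.execute("""
    SELECT datetime(0, 'unixepoch'),             -- INTEGER epoch to TEXT
           julianday('1970-01-01 00:00:00'),     -- TEXT to REAL Julian day
           strftime('%s', '2038-01-19 03:14:07') -- TEXT to epoch string
""").fetchone()
print(row)
# ('1970-01-01 00:00:00', 2440587.5, '2147483647')
```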
While the REAL format above is likely closest to Oracle's internal representation, the UNIX epoch format is smaller and has conversion functions in Oracle that are better described. We can determine the limits of 32-bit UNIX epoch time with a small gawk script:
$ cat strftime.gawk
#!/bin/gawk -f

BEGIN {
  for(i=1;i<ARGC;i++)
    print strftime("%Y/%m/%d %H:%M:%S",ARGV[i])
}
An older version of gawk on a 32-bit platform allowed negative epoch timestamps, going back to December of 1901 (note these times are expressed relative to the local time zone, in this case CST):
$ ./strftime.gawk -2147483648 0 2147483647
1901/12/13 14:45:52
1969/12/31 18:00:00
2038/01/18 21:14:07
Newer versions of gawk do not allow negative epoch timestamps (oddly enough, modern Busybox awk does not share this limitation):
$ ./strftime.gawk -2147483648
gawk: ./strftime.gawk:2: fatal: strftime: second argument less than 0 or too big for time_t
The maximum times that can be expressed by gawk and easily carried to Oracle on a 64-bit system appear to be:
$ ./strftime.gawk 0 253402322399 67767976233554395
1969/12/31 18:00:00
9999/12/31 23:59:59
2147483647/12/31 23:59:52
This situation is mostly documented in the “Year 2038 problem” Wikipedia article.
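The same limits can be checked without gawk. Below is a Python analogue of the script above (note that it formats in UTC, whereas the gawk runs above used the author's local zone, CST):

```python
import time

# Format a UNIX epoch value as the gawk script does, but in UTC.
def show(ts: int) -> str:
    return time.strftime("%Y/%m/%d %H:%M:%S", time.gmtime(ts))

print(show(0))           # 1970/01/01 00:00:00
print(show(2147483647))  # 2038/01/19 03:14:07
```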
Oracle has a website with conversion functions for translating the Oracle DATE type to UNIX epoch time. Note that I have adjusted the min/max times: the lower limit to the results above, and the upper limit to the very last second of the year 9999, as this is safe in 64-bit UNIX epoch time. I present a modified version of this conversion function below that returns the min/max values on overflow, rather than an error:
CREATE OR REPLACE FUNCTION date_to_unixts(oracle_date IN DATE)
RETURN NUMBER IS
  unix_epoch DATE := TO_DATE('19700101000000','YYYYMMDDHH24MISS');
  max_date   DATE := TO_DATE('99991231235959','YYYYMMDDHH24MISS');
  min_date   DATE := TO_DATE('19011213194552','YYYYMMDDHH24MISS');
  unix_ts NUMBER(38,0);
BEGIN
  IF oracle_date > max_date THEN
    return(253402322399);
  ELSIF oracle_date < min_date THEN
    return(-2147483648);
  ELSE
    unix_ts := (oracle_date - unix_epoch) / (1/86400);
  END IF;
  RETURN (unix_ts);
END;
/
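The clamping behavior of the PL/SQL above is simple enough to express in a few lines. Here is a Python sketch of the same guard (the constants are copied from the function; the upper bound corresponds to the author's local-zone rendering of the last second of 9999, as shown in the gawk output earlier):

```python
# Bounds copied from the date_to_unixts PL/SQL function above.
MIN_TS, MAX_TS = -2147483648, 253402322399

def clamp_ts(ts: int) -> int:
    """Return ts clamped to the safe epoch range, as the PL/SQL does."""
    return max(MIN_TS, min(MAX_TS, ts))

print(clamp_ts(10**18))   # 253402322399
print(clamp_ts(-10**18))  # -2147483648
print(clamp_ts(0))        # 0
```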
Once this function is in place, we are ready to build a SELECT that will pull all the target columns from an Oracle table, specifically converting the date types to UNIX epoch format. To do this, a SELECT that explicitly invokes the converter on the DATE columns is required:
$ cat textract_dml.sql
set pages 50000 lin 32767 verify off heading off feedback off newpage none

variable own varchar2(128);
variable nam varchar2(128);

execute :own := upper('&1');
execute :nam := upper('&2');

select null, 'select' from dual;

select sep, substr(col,1,cl) col, com, num
from (
  select decode(COLUMN_ID, 1, '', ',') sep,
         COLUMN_NAME col, '--' com, COLUMN_ID num,
         (select max(length(COLUMN_NAME))+16 from ALL_TAB_COLUMNS
           where OWNER = :own and TABLE_NAME = :nam) cl
    from ALL_TAB_COLUMNS
   where OWNER = :own and TABLE_NAME = :nam
     and DATA_TYPE IN ('CHAR', 'VARCHAR2', 'NUMBER')
  union all
  select decode(COLUMN_ID, 1, '', ',') sep,
         'date_to_unixts(' || COLUMN_NAME || ')' col, '--' com, COLUMN_ID num,
         (select max(length(COLUMN_NAME))+16 from ALL_TAB_COLUMNS
           where OWNER = :own and TABLE_NAME = :nam) cl
    from ALL_TAB_COLUMNS
   where OWNER = :own and TABLE_NAME = :nam and DATA_TYPE = 'DATE'
  order by num
);

select null, 'from '||:own||'.'||:nam||';', null from dual;
The query to export ALL_OBJECTS is below.
SQL> spool all_objects-dump.sql
SQL> start textract_dml sys all_objects
select
 OWNER -- 1
, OBJECT_NAME -- 2
, SUBOBJECT_NAME -- 3
, OBJECT_ID -- 4
, DATA_OBJECT_ID -- 5
, OBJECT_TYPE -- 6
, date_to_unixts(CREATED) -- 7
, date_to_unixts(LAST_DDL_TIME) -- 8
, TIMESTAMP -- 9
, STATUS -- 10
, TEMPORARY -- 11
, GENERATED -- 12
, SECONDARY -- 13
from sys.all_objects;
SQL> spool off
Carefully examine the all_objects-dump.sql file, and adjust it to be precisely the query for the SQLite export - no more, and no less.
There might be reasons to make adjustments to the SQL produced by the script:
- If several tables are being exported to SQLite, it might be desirable to export them all as they appeared at a specific point in recent time, so the whole set of tables is “read-consistent.” This is done with the AS OF clause of a SELECT statement, and the user must have the “flashback on <table>” privilege on each table in order to see a historical version. Oracle's UNDO tablespace must also have sufficient retention to reconstruct the recovery images.
- Any additional WHERE clause content to limit row visibility should be added at this stage (as opposed to deleting rows from SQLite after transfer).
- Only SQLite 3.25 and later have window functions. If the result of a window function is needed, and this ability is not available on the target SQLite implementation, add it in Oracle before transfer.
- Any transformations of the data (for example, substring functions to prune char/varchar fields of irrelevant or unneeded data, numeric adjustments specific to the target, etc.) are better performed as part of the transfer, as opposed to a post-transfer cleanup.
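The window-function caveat above is easy to check at runtime. A minimal sketch using Python's built-in sqlite3 module (my illustration): inspect the linked library version before relying on OVER clauses.

```python
import sqlite3

# Window functions require SQLite 3.25.0 or later.
print(sqlite3.sqlite_version)
major, minor = map(int, sqlite3.sqlite_version.split(".")[:2])
if (major, minor) >= (3, 25):
    con = sqlite3.connect(":memory:")
    con.executescript("CREATE TABLE t(x); INSERT INTO t VALUES (1),(2),(3);")
    # A running sum via a window function.
    print(con.execute("SELECT x, sum(x) OVER (ORDER BY x) FROM t").fetchall())
    # [(1, 1), (2, 3), (3, 6)]
```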
Orchestration - CSV
With a compatible SQLite database defined, and a query that will properly filter the extracted data, data transfer is now possible.
The approach of this section will entail CSV export/import mode set in both Oracle sqlplus and the SQLite command line shell. This transfer will take place over a “named pipe” (fifo); the CSV file will never actually exist on disk. This technique is commonly used with Oracle's export/import tools.
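The named-pipe technique itself can be sketched in a few lines. Below, a writer thread stands in for sqlplus spooling CSV, and the reader imports the rows without the CSV ever landing on disk (Python's built-in modules; the filenames are mine):

```python
import csv
import os
import sqlite3
import tempfile
import threading

# Create a named pipe (POSIX only); no CSV file will exist on disk.
d = tempfile.mkdtemp()
fifo = os.path.join(d, "fifo.csv")
os.mkfifo(fifo)

def producer():
    # Stands in for sqlplus spooling CSV into the fifo.
    with open(fifo, "w") as f:
        f.write("1,alpha\n2,beta\n")

threading.Thread(target=producer).start()

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t(id INT, name TEXT)")
with open(fifo) as f:  # blocks until the producer opens its end
    con.executemany("INSERT INTO t VALUES (?, ?)", csv.reader(f))
print(con.execute("SELECT count(*) FROM t").fetchone()[0])  # 2
```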
A POSIX shell script to perform this transfer, tested with Debian's dash, is presented below:
$ cat textract.sh
#!/bin/sh

set -eu # http://redsymbol.net/articles/unofficial-bash-strict-mode/
#set -euo pipefail # http://redsymbol.net/articles/unofficial-bash-strict-mode/
IFS=$'\n\t'

head='SET HEADING OFF FEEDBACK OFF MARKUP CSV ON'

#TNS='/ as sysdba'
#TNS="(description= \
#  (address= \
#    (protocol=tcp) \
#    (host=db.myco.com) \
#    (port=1521)) \
#  (connect_data= \
#    (sid=orcl)))"

[ -z "${1:-}" ] && { echo "
$(basename "$0" ".sh") - table extractor
 -f Destination sqlite file
 -n Source oracle database (TNS descriptor)
 -s Oracle extraction SQL file
 -t Destination sqlite table
 -u Oracle username
"; exit; }

while getopts f:n:p:s:t:u: arg
do case "$arg" in
     f) fILE="$OPTARG" ;;
     n) TNS="$OPTARG" ;;
#    p) pASS="$OPTARG" ;;
     s) sqlX="$OPTARG" ;;
     t) tABL="$OPTARG" ;;
     u) uSER="$OPTARG" ;;
   esac
done

getv () { getVAR="$1"; getVAL="$(eval printf %s '"${'"$getVAR"':-}"')"
  [ -z "$getVAL" ] && { shift; echo "$@"; read getINP
    [ -z "$getINP" ] && return; eval "$getVAR"='"'$getINP'"'; }; return 1; }

getv sqlX "Oracle Extraction SQL filename: " && exit
getv TNS -n "Source Oracle database: " && exit
getv fILE -n "Destination sqlite file: " && exit
getv tABL -n "Destination sqlite table: " && exit
getv uSER -n "User (default $LOGNAME): " && uSER="$LOGNAME"

echo "Logging in / executing as $uSER."

DIR="$(mktemp -d "/tmp/$(basename "$0" ".sh")XXXXX")"
FIFO="$DIR/fifo.lst"
GET="$DIR/get.sql"
PUT="$DIR/put.sql"

mkfifo "$FIFO"

trap 'stty echo; rm -fv "$FIFO" "$GET" "$PUT"; rmdir -v "$DIR"' EXIT

[ -z "${pASS:-}" ] &&
  { stty -echo; getv pASS -n "Password: " && : ; stty echo; echo ''; }

echo "$head
spool $FIFO" > "$GET"
cat "$sqlX" >> "$GET"

( sqlplus -silent "$uSER/$pASS@$TNS" "@${GET}" |
    sed -n '0~10000p' ) &

unset pASS

echo ".mode csv
.import ${FIFO} ${tABL}" > "$PUT"

sqlite3 "$fILE" ".read $PUT"

wait
The script must be given several parameters in order to perform a table transfer, and will prompt for any that are not specified. Of particular note is the TNS (Transparent Network Substrate) descriptor: there are commented examples allowing the script to be locked to a particular target, either executed locally as SYSDBA, or with a full descriptor for a remote database. If these are not set and the -n option is absent, the user will be prompted for it.
Some variant of this script could run on Windows. The Busybox Win32 POSIX shell does not implement mkfifo; Cygwin or the WSL might be a better choice.
The requirements and behavior of the script are best described with an example export run:
$ time ./textract.sh -f orasys.db3 -u fishecj \
    -s all_objects-dump.sql -t all_objects
Logging in / executing as fishecj.
Password:
"SCOTT","EMPPK",,10775,20472,"INDEX",1366667998,1369409340,"2013-04-22:21:59:58","VALID","N","N","N"
"TIGER","DEPT",,29780,,"SYNONYM",1371564246,1371564246,"2013-06-18:14:04:06","VALID","N","N","N"
removed ‘/tmp/textractpH3qJ/fifo.lst’
removed ‘/tmp/textractpH3qJ/get.sql’
removed ‘/tmp/textractpH3qJ/put.sql’
rmdir: removing directory, ‘/tmp/textractpH3qJ’

real 0m7.885s
user 0m0.448s
sys  0m0.078s
- There is a commented -p option to specify the Oracle account password on the command line; this should not be used, as it will be entered into the command history and may appear as an argument in /proc/*/cmdline.
- Once the script is running, both sqlplus and sqlite3 will communicate via the fifo.lst created in the temporary directory, and should be visible as consuming CPU in the top report.
- One out of every 10,000 CSV lines processed is printed on the standard output to indicate continuing activity; if this is not desired, remove the sed and redirect the sqlplus output to the null device. For more thorough silence, remove the -v options in the exit trap.
- The date_to_unixts function must be visible when executing the script, either as an object created in the local account, or via a synonym granted by another owner.
- If running under bash or Korn, use the full "redsymbol" strict mode. Debian's dash POSIX shell does not implement set -o pipefail; if your shell does, enable it.
- When debugging the script, it is useful to comment out the trap trigger.
- There is a SQLite library that will provide the equivalent of external tables, for an alternate approach to CSV processing.
The transfer of ALL_OBJECTS should be relatively quick (7 seconds, as recorded above). Confirm the row counts and DATE values from Oracle sqlplus and SQLite:
SQL> select count(*) from all_objects;

  COUNT(*)
----------
     23699

SQL> select created, last_ddl_time from all_objects where object_name='EMPPK';

CREATED             LAST_DDL_TIME
------------------- -------------------
2013-04-22 21:59:58 2013-05-24 15:29:00
These Oracle results do not quite agree with SQLite, if we check the DATE values carefully:
$ sqlite3 orasys.db3
SQLite version 3.36.0 2021-06-18 18:36:39
Enter ".help" for usage hints.
sqlite> select count(*) from all_objects;
23699
sqlite> select * from all_objects where object_name='EMPPK';
SCOTT|EMPPK||10775.0|20472.0|INDEX|1366667998|1369409340|2013-04-22:21:59:58|VALID|N|N|N
sqlite> select datetime(CREATED, 'unixepoch', 'localtime'), datetime(LAST_DDL_TIME, 'unixepoch', 'localtime') from all_objects where object_name='EMPPK';
2013-04-22 16:59:58|2013-05-24 10:29:00
sqlite> select datetime(CREATED, 'unixepoch'), datetime(LAST_DDL_TIME, 'unixepoch') from all_objects where object_name='EMPPK';
2013-04-22 21:59:58|2013-05-24 15:29:00
sqlite> .quit
The DATE values differ by the local time zone, unless the localtime argument is removed.
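The stored epoch value can be verified independently; a short sketch using Python's built-in sqlite3 module (my illustration, with the CREATED value for EMPPK taken from the export above):

```python
import sqlite3

con = sqlite3.connect(":memory:")
created = 1366667998  # CREATED for EMPPK, from the export above

# Without 'localtime', SQLite renders the epoch in UTC, matching Oracle.
print(con.execute("SELECT datetime(?, 'unixepoch')", (created,)).fetchone()[0])
# 2013-04-22 21:59:58
```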
We can create and populate a small Oracle table to confirm that we can exceed the 2038 limit:
create table birthday (who varchar2(4000), theday date);
insert into birthday values ('joe',  to_date('2030/04/01','YYYY/MM/DD'));
insert into birthday values ('jane', to_date('2040/05/01','YYYY/MM/DD'));
insert into birthday values ('jed',  to_date('2050/06/01','YYYY/MM/DD'));
insert into birthday values ('jill', to_date('9999/12/31','YYYY/MM/DD'));
commit;
After the export process above:
sqlite> select * from birthday;
joe|1901232000
jane|2219443200
jed|2537654400
jill|253400000000
sqlite> select who, datetime(theday, 'unixepoch', 'localtime') from birthday;
joe|2030-03-31 19:00:00
jane|2040-04-30 18:00:00
jed|2050-05-31 18:00:00
jill|9999-12-05 02:53:20
sqlite> select who, datetime(theday, 'unixepoch') from birthday;
joe|2030-04-01 00:00:00
jane|2040-05-01 00:00:00
jed|2050-06-01 00:00:00
jill|9999-12-05 08:53:20
The epoch time doesn't take us quite to the end of the year 9999, but it's close enough.
dg4odbc
The CSV transfer mode above is far more reliable than dg4odbc, as will be shown in the results of this section. Database links directly into SQLite are attractive, as they can preserve transaction integrity, but the interface is brittle and prone to failure.
Without an active Oracle support contract, dg4odbc should not be relied upon for production use with SQLite. Those using Oracle's free XE database without support should avoid dg4odbc if at all possible. The ODBC server package contains the warning: “The driver is usable but may contain bugs. Use it on your own risk.”
To proceed with ODBC installation, the following OS packages must be installed:
# yum install sqlite sqlite-devel unixODBC unixODBC-devel
Loaded plugins: langpacks, ulninfo
Package sqlite-3.7.17-8.el7_7.1.x86_64 already installed and latest version
Package unixODBC-2.3.1-14.0.1.el7.x86_64 already installed and latest version
Resolving Dependencies
--> Running transaction check
---> Package sqlite-devel.x86_64 0:3.7.17-8.el7_7.1 will be installed
---> Package unixODBC-devel.x86_64 0:2.3.1-14.0.1.el7 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

================================================================================
 Package           Arch      Version               Repository       Size
================================================================================
Installing:
 sqlite-devel      x86_64    3.7.17-8.el7_7.1      ol7_latest      104 k
 unixODBC-devel    x86_64    2.3.1-14.0.1.el7      ol7_latest       54 k

Transaction Summary
================================================================================
Install  2 Packages

Total download size: 158 k
Installed size: 552 k
Is this ok [y/d/N]: y
Downloading packages:
(1/2): unixODBC-devel-2.3.1-14.0.1.el7.x86_64.rpm          |  54 kB  00:00
(2/2): sqlite-devel-3.7.17-8.el7_7.1.x86_64.rpm            | 104 kB  00:03
--------------------------------------------------------------------------------
Total                                              47 kB/s | 158 kB  00:03
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
  Installing : sqlite-devel-3.7.17-8.el7_7.1.x86_64          1/2
  Installing : unixODBC-devel-2.3.1-14.0.1.el7.x86_64        2/2
  Verifying  : unixODBC-devel-2.3.1-14.0.1.el7.x86_64        1/2
  Verifying  : sqlite-devel-3.7.17-8.el7_7.1.x86_64          2/2

Installed:
  sqlite-devel.x86_64 0:3.7.17-8.el7_7.1
  unixODBC-devel.x86_64 0:2.3.1-14.0.1.el7

Complete!
The SQLite ODBC package is distributed in source form. For systems based on RedHat, there is a source RPM that can easily build installable packages. Obtain it, and build the binary package with the following command (which will require an installed C compiler):
rpmbuild --rebuild sqliteodbc-0.9998-1.src.rpm
When the build is complete, the compiled packages can be found in the following location:
$ ll ~/rpmbuild/RPMS/x86_64/
total 408
-rw-rw-r--. 1 cfisher cfisher 120440 Jul 12 19:08 sqliteodbc-0.9998-1.x86_64.rpm
-rw-rw-r--. 1 cfisher cfisher 294704 Jul 12 19:08 sqliteodbc-debuginfo-0.9998-1.x86_64.rpm
Install the desired packages on target servers:
# rpm -Uvh ~fishecj/rpmbuild/RPMS/x86_64/sqliteodbc-0.9998-1.x86_64.rpm
Preparing...                          ################################# [100%]
Updating / installing...
   1:sqliteodbc-0.9998-1              ################################# [100%]
odbcinst: Driver installed. Usage count increased to 1.
    Target directory is /etc
Examine the README:
# rpm -ql sqliteodbc
/usr/lib64/libsqlite3_mod_blobtoxy-0.9998.so
/usr/lib64/libsqlite3_mod_blobtoxy.so
/usr/lib64/libsqlite3_mod_csvtable-0.9998.so
/usr/lib64/libsqlite3_mod_csvtable.so
/usr/lib64/libsqlite3_mod_impexp-0.9998.so
/usr/lib64/libsqlite3_mod_impexp.so
/usr/lib64/libsqlite3_mod_xpath-0.9998.so
/usr/lib64/libsqlite3_mod_xpath.so
/usr/lib64/libsqlite3_mod_zipfile-0.9998.so
/usr/lib64/libsqlite3_mod_zipfile.so
/usr/lib64/libsqlite3odbc-0.9998.so
/usr/lib64/libsqlite3odbc.so
/usr/share/doc/sqliteodbc-0.9998
/usr/share/doc/sqliteodbc-0.9998/ChangeLog
/usr/share/doc/sqliteodbc-0.9998/README
/usr/share/doc/sqliteodbc-0.9998/license.terms
Note the driver specification has been added:
# cat /etc/odbcinst.ini
[ODBC Driver 11 for SQL Server]
Description=Microsoft ODBC Driver 11 for SQL Server
Driver=/opt/microsoft/msodbcsql/lib64/libmsodbcsql-11.0.so.2270.0
Threading=1
UsageCount=1

[SQLITE3]
Description=SQLite ODBC 3.X
Driver=/usr/lib64/libsqlite3odbc.so
Setup=/usr/lib64/libsqlite3odbc.so
Threading=2
FileUsage=1
UsageCount=1
A bogus global handle has also been added:
# tail -2 /etc/odbc.ini
[SQLite3 Datasource]
Driver=SQLITE3
As root, add a data source entry for the specific SQLite database file created in the last section (note that /etc/odbc.ini is global for all users, subject to filesystem permissions; private odbc.ini files can be created, but are not covered here):
echo '
[SQLitec1]
Description=textract PoC
Driver=/usr/lib64/libsqlite3odbc.so
Database=/home/oracle/orasys.db3
# optional lock timeout in milliseconds
Timeout=2000' >> /etc/odbc.ini
Test the data source with the unixODBC isql tool:
# isql SQLitec1
+---------------------------------------+
| Connected!                            |
|                                       |
| sql-statement                         |
| help [tablename]                      |
| quit                                  |
|                                       |
+---------------------------------------+
SQL> select count(*) from all_objects;
+-----------+
| count(*)  |
+-----------+
| 23699     |
+-----------+
SQLRowCount returns 0
1 rows fetched
SQL> select datetime(CREATED, 'unixepoch'), datetime(LAST_DDL_TIME, 'unixepoch') from all_objects where object_name='EMPPK';
+--------------------------------+--------------------------------------+
| datetime(CREATED, 'unixepoch') | datetime(LAST_DDL_TIME, 'unixepoch') |
+--------------------------------+--------------------------------------+
| 2013-04-22 21:59:58            | 2013-05-24 15:29:00                  |
+--------------------------------+--------------------------------------+
SQLRowCount returns 0
1 rows fetched
SQL> quit
With the ODBC server components in place, we are ready to configure connectivity for an init.ora specific to the Oracle dg4odbc
client.
First, the DBA must add a TNS client entry to the following file:
$ORACLE_HOME/network/admin/tnsnames.ora
The TNS entry is similar to the one for the local database SID; be sure to match the listener's configured host exactly:
sqlitec1 =
  (DESCRIPTION=
    (ADDRESS=
      (PROTOCOL=tcp)
      (HOST=1.2.3.4)
      (PORT=1524))
    (CONNECT_DATA=
      (SID=SQLitec1))
    (HS=OK))
Once the TNS client entry is in place, the database listener must be configured to launch the ODBC service on demand, by adding an entry to the following file:
$ORACLE_HOME/network/admin/listener.ora
The new service can be added to the existing SID list:
(SID_DESC=
  (SID_NAME=SQLitec1)
  (ORACLE_HOME=/home/oracle/Ora19)
  (PROGRAM=dg4odbc)
)
At this point, restart the listener:
lsnrctl restart
Hopefully, your listener reports the new ODBC SID:
Service "SQLitec1" has 1 instance(s).
  Instance "SQLitec1", status UNKNOWN, has 1 handler(s) for this service...
Finally, create a new init.ora file that targets dg4odbc
in the following location:
$ORACLE_HOME/hs/admin/initSQLitec1.ora
This file should have the following contents:
HS_FDS_CONNECT_INFO = SQLitec1
HS_FDS_TRACE_LEVEL = 4
The tnsping utility can confirm that the listener is aware of the new ODBC service:
$ $ORACLE_HOME/bin/tnsping sqlitec1

TNS Ping Utility for Linux: Version 19.0.0.0.0 - Production on 09-JUL-2021 12:18:01

Copyright (c) 1997, 2020, Oracle.  All rights reserved.

Attempting to contact (description=(address=(protocol=tcp)(host=1.2.3.4)(port=1521))(connect_data=(sid=SQLitec1)))
OK (0 msec)
The DBA is now ready to create a database link for a connectivity test:
$ $ORACLE_HOME/bin/sqlplus '/ as sysdba'

SQL*Plus: Release 19.0.0.0.0 - Production on Thu Jul 8 13:45:19 2021
Version 19.3.0.0.0

Copyright (c) 1982, 2019, Oracle.  All rights reserved.

Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.3.0.0.0

SQL> create database link lite_c1
     connect to nouser identified by nopass using 'sqlitec1';

Database link created.

SQL> select count(*) from all_objects@lite_c1;

COUNT(*)
--------
   23699
Notes:
- The nouser/nopass above are fake credentials; SQLite does not use password-based access controls, but the connection will fail without them.
- The tnsnames.ora entry can be omitted if a full TNS descriptor (with host and port) appears after "using".
- If the database is running on RedHat 8, the SELECT may fail until the line

  HS_FDS_SHAREABLE_NAME=/usr/lib64/libodbc.so

  is added to the hs/admin/initSQLitec1.ora file. It might also be necessary to add the line

  SET EXTPROC_DLLS=/usr/lib64/libodbc.so:/usr/lib64/libsqlite3odbc.so

  to the hs/admin/extproc.ora file.
Basic DML can now be performed upon SQLite databases from within Oracle:
SQL> delete from all_objects@lite_c1;

23699 rows deleted.

SQL> commit;

Commit complete.
However, attempting to reinsert from the ALL_OBJECTS complex view...
SQL> insert into all_objects@lite_c1
     select OWNER                      -- 1
     , OBJECT_NAME                     -- 2
     , SUBOBJECT_NAME                  -- 3
     , OBJECT_ID                       -- 4
     , DATA_OBJECT_ID                  -- 5
     , OBJECT_TYPE                     -- 6
     , date_to_unixts(CREATED)         -- 7
     , date_to_unixts(LAST_DDL_TIME)   -- 8
     , TIMESTAMP                       -- 9
     , STATUS                          -- 10
     , TEMPORARY                       -- 11
     , GENERATED                       -- 12
     , SECONDARY                       -- 13
     from sys.all_objects;
...meets repeated failure, even if many offending lines are replaced by NULL:
ERROR at line 8:
ORA-02070: database LITE_C1 does not support DECODE in this context

ERROR at line 9:
ORA-02070: database LITE_C1 does not support operator 169 in this context

ERROR at line 10:
ORA-02070: database LITE_C1 does not support operator 169 in this context

ERROR at line 11:
ORA-02070: database LITE_C1 does not support TO_CHAR in this context

ERROR at line 12:
ORA-02070: database LITE_C1 does not support DECODE in this context

ERROR at line 13:
ORA-02070: database LITE_C1 does not support DECODE in this context
Insertion from complex views seems to be a real problem for the database link; this activity is more suited to the CSV transfer method in the previous section.
Primitive DML does appear to be functional:
SQL> insert into all_objects@lite_c1 (OWNER, OBJECT_NAME) values ('foo','bar');

1 row created.

SQL> commit;

Commit complete.
The data inserted by Oracle does appear within SQLite:
SQL> select * from all_objects@lite_c1;

OWNER OBJECT_NAME ...
----- ----------- ...
foo   bar         ...
Oracle can also report the remote SQLite version, if the sqlite_version() function is wrapped in a view:
sqlite> create view my_version as select sqlite_version();
sqlite> select * from my_version;
3.36.0
Oracle is able to select this view:
SQL> select * from my_version@lite_c1;

sqlite_version()
----------------
3.7.17

SQL> select version from v$instance;

VERSION
-----------------
19.0.0.0.0
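The view wrapper can be exercised outside Oracle as well. A small sketch with Python's stdlib sqlite3 module (in-memory database; the column alias v is added here for convenience):

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Wrap the bare function call in a view, as in the sqlite3 session above,
# so that clients limited to tables and views can still read the version.
con.execute("create view my_version as select sqlite_version() as v")
(v,) = con.execute("select v from my_version").fetchone()
print(v)                        # version of the linked SQLite library
print(sqlite3.sqlite_version)   # same value, reported by the module
```

Whatever library is linked answers the call, which is why Oracle sees the older version bundled with the ODBC driver.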
On RedHat 8, the shell's file utility is able to report the version of the SQLite library that last wrote to the database:
$ file /home/oracle/orasys.db3
/home/oracle/orasys.db3: SQLite 3.x database, last written using SQLite version 3036000
The ODBC instance uses an older SQLite version, and the downgrade is visible after ODBC has engaged in DML:
$ file /home/oracle/orasys.db3
/home/oracle/orasys.db3: SQLite 3.x database, last written using SQLite version 3026000
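What file reports comes straight out of the database header: per the SQLite file-format documentation, bytes 96-99 hold the SQLITE_VERSION_NUMBER of the library that most recently wrote the file, as a big-endian integer. A Python sketch (stdlib only; the temporary path is illustrative):

```python
import os
import sqlite3
import struct
import tempfile

# Create a small database so the 100-byte header exists on disk.
path = os.path.join(tempfile.mkdtemp(), "demo.db3")
con = sqlite3.connect(path)
con.execute("create table t (x)")
con.commit()
con.close()

# Per the SQLite file-format spec, bytes 96-99 of the header hold the
# SQLITE_VERSION_NUMBER of the library that last wrote the file, stored
# big-endian; this is what file(1) reports as "last written using".
with open(path, "rb") as f:
    header = f.read(100)
(vnum,) = struct.unpack(">I", header[96:100])
print(vnum)  # e.g. 3036000 for SQLite 3.36.0
```

The number encodes major*1000000 + minor*1000 + patch, so a downgrade by the ODBC driver is immediately visible.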
ODBC also appears to fail with some SQL due to callback links in the SQLite ODBC implementation, with the error:
ORA-02025: all tables in the SQL statement must be at the remote database
Realistically, this ODBC interface is not robust.
PHP to the rescue
As brittle as ODBC might be, PHP is far more dependable. PHP handles the LONG datatype with ease, only a single client process is active at any time (respecting SQLite's single-writer rule), and it runs as a shell interpreter in addition to its usual role as a web server engine.
Let's consider the previous ALL_VIEWS
that contained a LONG
column:
SQL> desc all_views
 Name                                      Null?    Type
 ----------------------------------------- -------- --------------
 OWNER                                     NOT NULL VARCHAR2(30)
 VIEW_NAME                                 NOT NULL VARCHAR2(30)
 TEXT_LENGTH                                        NUMBER
 TEXT                                               LONG
 TYPE_TEXT_LENGTH                                   NUMBER
 TYPE_TEXT                                          VARCHAR2(4000)
 OID_TEXT_LENGTH                                    NUMBER
 OID_TEXT                                           VARCHAR2(4000)
 VIEW_TYPE_OWNER                                    VARCHAR2(30)
 VIEW_TYPE                                          VARCHAR2(30)
 SUPERVIEW_NAME                                     VARCHAR2(30)
 EDITIONING_VIEW                                    VARCHAR2(1)
We can declare a TEXT column in place of the LONG for this table:
$ sqlite3 orasys.db3
SQLite version 3.36.0 2021-06-18 18:36:39
Enter ".help" for usage hints.
sqlite> create table all_views (
   ...>   OWNER TEXT              -- 1
   ...> , VIEW_NAME TEXT          -- 2
   ...> , TEXT_LENGTH NUM         -- 3
   ...> , TEXT TEXT               -- 4
   ...> , TYPE_TEXT_LENGTH NUM    -- 5
   ...> , TYPE_TEXT TEXT          -- 6
   ...> , OID_TEXT_LENGTH NUM     -- 7
   ...> , OID_TEXT TEXT           -- 8
   ...> , VIEW_TYPE_OWNER TEXT    -- 9
   ...> , VIEW_TYPE TEXT          -- 10
   ...> , SUPERVIEW_NAME TEXT     -- 11
   ...> , EDITIONING_VIEW TEXT    -- 12
   ...> );
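A TEXT column is a safe landing place for LONG data: SQLite imposes no declared length on TEXT, and the default SQLITE_MAX_LENGTH ceiling is roughly one gigabyte. A quick check with Python's stdlib sqlite3 module (in-memory database; the sample string is fabricated for illustration):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("create table all_views (view_name text, text text)")

# A multi-megabyte string, far beyond VARCHAR2's 4000-byte limit,
# stands in for a very large LONG view definition.
big = "select 1 from dual union all\n" * 200000
con.execute("insert into all_views values (?, ?)", ("BIG_VIEW", big))

(n,) = con.execute(
    "select length(text) from all_views where view_name='BIG_VIEW'"
).fetchone()
print(n)  # the full string length is preserved
```

Nothing is truncated on the way in or out, which is exactly the behavior the PHP scripts below rely upon.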
Then a simple script can copy the table row-by-row, using bind variables. I am presenting two versions below, one using the native PHP SQLite class, and another using the PHP Data Objects (PDO) generic abstraction class. The PDO version seems slightly clearer.
PHP Native SQLite

$ cat dump_all_views-sqlite.php
#!/usr/local/bin/php -f
<?php

$litedb = new SQLite3('orasys.db3');

$conn = oci_connect(getenv('DBUSER'), getenv('DBPASS'), getenv('TNS'));

$curs = oci_parse($conn, <<<'EndOfSQL'
select * from all_views where owner = upper(:the_owner)
EndOfSQL
);

oci_bind_by_name($curs, ':the_owner', $argv[1], -1);
oci_execute($curs);

$q = 'insert into ALL_VIEWS values (';
for($n = 0; $n < oci_num_fields($curs); $n++)
    $q .= ($n == 0 ? '?' : ',?');
$q .= ')';

$litedb->exec('BEGIN;');
$insert = $litedb->prepare($q);

while($values = oci_fetch_array($curs,
                OCI_NUM + OCI_RETURN_NULLS + OCI_RETURN_LOBS))
{
    foreach($values as $num => $val)
        $insert->bindParam($num + 1, $values[$num]);
    $insert->execute();
    $insert->reset();
}

$litedb->exec('COMMIT;');

oci_free_statement($curs);
oci_close($conn);

?>

PHP PDO SQLite
$ cat dump_all_views-PDO.php
#!/usr/local/bin/php -f
<?php

$litedb = new PDO('sqlite:orasys.db3');

$conn = oci_connect(getenv('DBUSER'), getenv('DBPASS'), getenv('TNS'));

$curs = oci_parse($conn, <<<'EndOfSQL'
select * from all_views where owner = upper(:the_owner)
EndOfSQL
);

oci_bind_by_name($curs, ':the_owner', $argv[1], -1);
oci_execute($curs);

$sql = 'insert into ALL_VIEWS values (';
for($n = 0; $n < oci_num_fields($curs); $n++)
    $sql .= ($n == 0 ? '?' : ',?');
$sql .= ')';

$litedb->beginTransaction();
$insert = $litedb->prepare($sql);

while($values = oci_fetch_array($curs,
                OCI_NUM + OCI_RETURN_NULLS + OCI_RETURN_LOBS))
    $insert->execute($values);

$litedb->commit();

oci_free_statement($curs);
oci_close($conn);

?>
Bind variables are much more important for Oracle than SQLite. Oracle has a “library cache” that is able to recognize identical SQL and “soft parse” it, which is much faster than non-cached SQL that must be “hard parsed.” In addition to the performance gains on a high-volume SQL statement, bind variables also provide some protection against “SQL injection,” a benefit likely shared by SQLite. The previous textract SQL scripts above have used bind variables.
SQLite bind variables can be of the Oracle type (:the_owner in the scripts above) or the Sybase/SQL Server type (?,?,?...). The latter are bound by numeric position, and the former by explicit name. The SQLite class begins the ? numeric positions at 1, not zero.
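Both placeholder styles behave the same way in any SQLite binding, not just PHP's. A sketch with Python's stdlib sqlite3 module (the table and values are illustrative):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("create table all_views (owner text, view_name text)")

# Positional ? placeholders: bound left to right, position 1 first.
con.execute("insert into all_views values (?, ?)", ("SYS", "ALL_VIEWS"))

# Named :name placeholders, the Oracle-style alternative.
con.execute("insert into all_views values (:owner, :view_name)",
            {"owner": "SYS", "view_name": "ALL_TABLES"})

rows = con.execute("select view_name from all_views order by 1").fetchall()
print([r[0] for r in rows])  # ['ALL_TABLES', 'ALL_VIEWS']
```

Either style keeps untrusted values out of the SQL text itself, which is the injection protection mentioned above.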
Below is an example execution that will extract this table (note that transmitting the password in this way will make it visible in /proc/*/environ
, which is not a best practice):
$ export DBUSER=fishecj \
  TNS='(description=(address=(protocol=tcp)(host=1.2.3.4)(port=1521))(connect_data=(sid=orcl)))'
$ stty -echo; read DBPASS; stty echo; echo ''
$ DBPASS="$DBPASS" ./dump_all_views-PDO.php SYS
Transaction control seems to boost SQLite performance, as the script will run much more slowly if the beginTransaction
and commit
are removed.
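This is expected: outside an explicit transaction, SQLite wraps each statement in its own transaction, and every commit forces the journal to be synced to durable storage. A rough illustration with Python's stdlib sqlite3 module (timings vary by filesystem, so none are asserted; the temporary path is illustrative):

```python
import os
import sqlite3
import tempfile
import time

path = os.path.join(tempfile.mkdtemp(), "txn.db3")
con = sqlite3.connect(path, isolation_level=None)  # autocommit mode
con.execute("create table t (n int)")

# One statement per transaction: every row pays a commit.
t0 = time.perf_counter()
for i in range(500):
    con.execute("insert into t values (?)", (i,))
per_row = time.perf_counter() - t0

# One transaction around the whole batch: a single commit at the end.
t0 = time.perf_counter()
con.execute("BEGIN")
for i in range(500):
    con.execute("insert into t values (?)", (i,))
con.execute("COMMIT")
batched = time.perf_counter() - t0
# On a real filesystem the batched loop is typically much faster.
```

This is why the beginTransaction/commit pair in the PDO script is worth keeping.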
Now the full content of Oracle's ALL_VIEWS
for the sys user can be seen in SQLite:
$ sqlite3 orasys.db3
SQLite version 3.36.0 2021-06-18 18:36:39
Enter ".help" for usage hints.
sqlite> select count(*) from all_views;
3812
sqlite> select text from all_views where view_name='ALL_VIEWS';
select u.name, o.name, v.textlength, v.text, t.typetextlength, t.typetext,
       t.oidtextlength, t.oidtext, t.typeowner, t.typename,
       decode(bitand(v.property, 134217728), 134217728,
              (select sv.name from superobj$ h, "_CURRENT_EDITION_OBJ" sv
               where h.subobj# = o.obj# and h.superobj# = sv.obj#), null),
       decode(bitand(v.property, 32), 32, 'Y', 'N'),
       decode(bitand(v.property, 16384), 16384, 'Y', 'N')
from sys."_CURRENT_EDITION_OBJ" o, sys.view$ v, sys.user$ u, sys.typed_view$ t
where o.obj# = v.obj#
  and o.obj# = t.obj#(+)
  and o.owner# = u.user#
  and (o.owner# = userenv('SCHEMAID')
       or o.obj# in (select oa.obj# from sys.objauth$ oa
                     where oa.grantee# in (select kzsrorol from x$kzsro))
       or /* user has system privileges */
       exists (select null from v$enabledprivs
               where priv_number in (-45 /* LOCK ANY TABLE */,
                                     -47 /* SELECT ANY TABLE */,
                                     -48 /* INSERT ANY TABLE */,
                                     -49 /* UPDATE ANY TABLE */,
                                     -50 /* DELETE ANY TABLE */)))
Conclusion
Some users bring great demands to legacy databases that cannot easily be met due to scalability or licensing constraints (Oracle database retail licensing is $47,500 per CPU core). SQLite is an option for offloading such users, and the pursuit of that option inspired this article.
There are certainly a few eccentricities in SQLite that make it radically different from conventional RDBMS environments. The extreme syntax tolerance, type fluidity, and lack of permissions are a shock for many users.
It is also not difficult to find DBAs who have never heard of SQLite. While it commonly flies "under the radar" of IT administration, developers are known to (ab)use it when they become frustrated with formal, SOX-compliant corporate databases. When these developers move on, the clandestine data stores often don't receive proper attention.
The pervasiveness of SQLite in mobile phones and other consumer electronics also borders on frightening. From this perspective, it is a primary surveillance tool, with dangerous impacts on privacy.
It is better for an IT organization to be familiar with SQLite than not, and for auditors to be aware of both appropriate use and the dangers of data exfiltration. Faith in the access controls of a client/server RDBMS can dissolve after one bad SQLite experience; hopefully this article will prevent that experience for a few.