A low-level ODBC interface for the Julia programming language
Installation through the Julia package manager:
julia> Pkg.init() # Creates julia package repository (only runs once for all packages)
julia> Pkg.add("ODBC") # Creates the ODBC repo folder and downloads the ODBC package + dependancy (if needed)
julia> using ODBC # Loads the ODBC module for use (needs to be run with each new Julia instance)
Exported functions, macros, types, and variables include:
-
ODBC.connect(dsn; usr="", pwd="")
ODBC.connect
requires thedsn
string argument as the name of a pre-defined ODBC datasource. Valid datasources (DSNs) must first be setup through the ODBC administrator (or IODBC, unixODBC, etc.) prior to connecting in Julia. Note the use ofODBC.
beforeconnect
, this is to prevent method ambiguity with theBase.connect
family of methods.The
usr
andpwd
named arguments are optional as they may already be defined in the datasource.ODBC.connect
returns aConnection
type which contains basic information about the connection and ODBC handle pointers.ODBC.connect
can be used by storing theConnection
type in a variable to be able to disconnect or facilitate handling multiple connections like so:co = ODBC.connect("mydatasource",usr="johndoe",pwd="12345")
But it's unneccesary to store the
Connection
, as an exportedconn
variable holds the most recently createdConnection
type and other ODBC functions (i.e.query
) will use it by default in the absence of a specified connection. -
disconnect(connection::Connection=conn)
disconnect
closes a connection type, frees all handles and resets the default connectionconn
as necessary. If invoked with no arguments (i.e.disconnect()
), the default connectionconn
is closed. -
advancedconnect(conn_string::String)
advancedconnect
implements the native ODBC SQLDriverConnect function which allows flexibility in connecting to a datasource through specifying a 'connection string' (e.g. "DSN=userdsn;UID=johnjacob;PWD=jingle;") See ODBC API documentation (http://goo.gl/uXTuk) for additional details.If the connection string doesn't contain enough information for the driver to connect, the user will be prompted with the additional information needed.
Furthermore, on Windows, if
advancedconnect()
is called without arguments the ODBC administrator will be brought up where the user can select the DSN to which to connect, even allowing the user to create a datasource or add a driver.(An excellent resource for learning how to construct connection strings for various DBMS/driver configurations is http://www.connectionstrings.com/)
-
query(connection::Connection=conn, querystring; file=:DataFrame,delim='\t')
If a connection type isn't specified as the first positional argument, the query will be executed against the default connection (stored in the exported variable
conn
if you'd like to inspect).Once the query is executed, the resultset is stored in a
DataFrame
by default (file=:DataFrame
). Otherwise, the user may specify a file name to which the resultset is to be written, along with the desired file delimiter (defaultdelim='\t'
). Depending on DBMS capability, users may also pass multiple query statements in a single query call and the resultsets will be returned in an array of DataFrames, or the user may specify an array of filename strings andChar
delimiters into which the results will be written.For the general user, a simple
query(querystring)
is enough to return a single resultset in a DataFrame. Results are stored in the passed connection type's resultset field. (i.e.conn.resultset
). Results are stored by default to avoid immediate garbarge collection and provide access for the user even if the resultset returned byquery()
isn't stored in a variable. -
querymeta(conn::Connection=conn, querystring; file=:DataFrame,delim='\t')
querymeta
is really just the 1st half of thequery
function. A query string is sent to the DBMS, executed, and metadata (i.e. rows, columns, types, column names, etc.) is returned to the user, avoiding actually returning the dataset. The returned information is actually stored in theMetadata
type, so the information may be programmatically examined (try runningnames(Metadata)
to see its fields). Runningquerymeta
is useful for inspecting the results of large queries while avoiding the overhead of returning the actual dataset into memory. The function signature is identical toquery
for ease in switching between the two thoughquerymeta
ignores thefile
anddelim
arguments. -
listdrivers()
Takes no arguments. Returns a list of installed ODBC drivers registered in the ODBC administator (IODBC, unixODBC, etc.).
-
listdsns()
Takes no arguments. Returns a list of defined datasources (DSNs) registered in the ODBC administator (IODBC, unixODBC, etc.). The datasource names can be used as the 1st argument in
ODBC.connect(dsn)
.
-
sql"..."
sql"..."
is a Julia string literal implemented by the@sql_str
macro. It is equivalent to callingquery(querystring)
as you can see from the actual definition below:macro sql_str(s) query(s) end
-
Connection
Stores information about a DSN connection. Names include
dsn
,number
(countsConnection
types specific to each DSN),dbc_ptr
andstmt_ptr
as internal connection and statement handle pointers, andresultset
which stores the last resultset returned from aquery
orquerymeta
call. -
Metadata
Stores information about an executed query, returned byquerymeta
. Names includequerystring
(the query sent to be executed),cols
(# of columns in resultset),rows
(# of rows in resultset),colnames
(column names to be returned in resultset),coltypes
(SQL types of resultset columns),colsizes
(size in bytes of resultset columns),coldigits
(max number of digits for numeric resultset columns; though not always implemented correctly by ODBC driver), andcolnulls
(whether the resultset column is nullable).
conn
Global, exported variable that initially holds a nullConnection
type until a connection is successfully made byODBC.connect
oradvancedconnect
. Is used byquery
andquerymeta
as the default datasourceConnection
if none is explicitly specified.Connections
Global, exported variable of typeArray{Connection,1}
, that holdsConnection
types. When multiple calls toODBC.connect
oradvancedconnect
are made,Connections
stores eachConnection
type to manage the number of DSN connections. It is also referenced when the default connectionconn
is disconnected and reset toConnections[end]
if other connections exist, or a nullConnection
type otherwise.
-
We've had limited ODBC testing between various platforms, so it may happen that
ODBC.jl
doesn't recognize your ODBC shared library (also know as the Driver Manager, basically the middleman betweenODBC.jl
and the RDBMS). The current approach is to check a variety of the most widely used ODBC libraries and produce an error if not found. If this happens, you'll need to manually locate your ODBC shared library (searching for something along the lines oflibodbc
orlibiodbc
, or installing it if you haven't yet) and then run the following:const odbc_dm = "path/to/library/libodbc.so" (or .dylib on OSX)
*Note that the file is
odbc32
on Windows, but should never have a problem being found (ships by default). That said, if you end up doing this, open an issue on GitHub to let me know what the name of your ODBC library is and I can add is as one of the defaults to check for.
- Create SQL typealiases and use in conjunction with Julia-C typealiases for ODBC_API (for more transparency and because we can)
- Metadata tools: This would involve specilized queries for examining DBMS schema, tables, views, columns, with associated metadata and possibly statistics. I know the driver managers support SQLTables and SQLStatistics, so it should be pretty simple to implement these.
- Create, Update Table functions (also auto-detect regular queries as these kinds of DDL queries): Pretty self-explanatory.
- Support more SQL data types: Date, Time, Intervals. Right now, all main bitstypes, character and binary formats (short, long, float, double, char, etc.) are supported, but the date and time data types are read as strings. Other implementations in C use structs to read them in and Julia is still fragile on struct support as far as I know. As Julia struct compatibility improves it's an eventual (I think RODBC package still only reads dates as strings...)
- Asynchronous querying: This might be a longshot, but the later ODBC API supports async querying through polling, so it would be cool to find a way to implement this. I'm not sure how useful it would be long term or exactly how it would be implemented (Call asyncquery() and then later call querydone() to see if it's finished?), but because the underlying api is capable this could be some cool functionality.
- How to deal with Unicode/ANSI function calling? (I think we're ok here, but not sure)