Monday, March 7, 2011

Teradata ODBC, Python pyodbc on Fedora Core 14 x32

Background:


This posting is an extension of an earlier posting and contains much of the same content starting out. Instead of trying to get the example programs to compile and run our objective is to get python working with ODBC through the pyodbc project.


The motivation of this project has to do with a desire of a colleague of mine to do development work on an installation of Fedora Core 14.

Keep in mind that FC is not at this time a supported distro for Teradata!

We are starting with a Fedora Core 14 32bit guest machine running in VMWare Workstation. Host is Windows XP SP3. All libraries required to get VMWare tools installed and running have been pre-installed so there may be some dependencies already in place that are not listed. The Fedora core instance is fully updated as of today. SELinux has also already been disabled.



Lets Go! - Download Driver Package:

Alright, first thing is first, lets start out by downloading the Teradata drivers. Go to:
www.teradata.com/downloadcenter/
Follow ODBC -> Linux and download the correct package (TTU 13.10 LINUX-INDEP tdodbc.13.10.00.01 in my case)

Keep going until you download the .tar.gz package (tdodbc__LINUX_INDEP.13.10.00.01-1.tar.gz)

Make sure to first read the accompanying README file before proceeding it will detail any dependencies that *may* need to be installed on your OS. Since FC is not supported we cannot rely on this documentation to be all inclusive but it can provide great hints.

Once you have downloaded the drivers, move into the directory holding the downloaded file. Create a directory to hold expanded files and then untar the archive:


[cj@fc14 Downloads]$ mkdir /tmp/td
[cj@fc14 Downloads]$ mv tdodbc__LINUX_INDEP.13.10.00.01-1.tar.gz /tmp/td/
[cj@fc14 Downloads]$ cd /tmp/td
[cj@fc14 td]$ tar xof tdodbc__LINUX_INDEP.13.10.00.01-1.tar
.gz

You should have now have 3 component archives and readme files in your working directory.
tdodbc*
tdicu*
TeraGSS*

TeraGSS has redhat and suse versions.

Take a moment out to remember that FC is a Redhat OS and expand each archive:

[cj@fc14 td]$ tar xof tdicu__linux_indep.13.10.00.00-1.tar.gz
[cj@fc14 td]$ tar xof TeraGSS_redhatlinux-i386__linux_i386.13.10.00.02-1.tar.gz
[cj@fc14 td]$ tar xof tdodbc__linux_x64.13.10.00.01.tar.gz


Installing Teradata Driver Packages:

Now install the tdicu package:

[cj@fc14 td]$ cd tdicu
[cj@fc14 tdicu]$ sudo rpm -ihv tdicu-13.10.00.00-1.noarch.rpm
 


Output:
Adding TD_ICU_DATA environment variable to /etc/profile file.
Adding TD_ICU_DATA environment variable to /etc/csh.login file.

Since the package updated /etc/profile, lets first load the profile changes into our shell (just in case):
[cj@fc14 tdicu]$ source /etc/profile

Next lets install TeraGSS:
[cj@fc14 tdicu]$ cd ../TeraGSS/
[cj@fc14 TeraGSS]$ sudo rpm -ihv TeraGSS_redhatlinux-i386-13.10.00.02-1.i386.rpm

Output:
Preparing...                ########################################### [100%]
   1:TeraGSS_redhatlinux-i38########################################### [100%]
/usr/teragss/redhatlinux-i386/13.10.00.02/bin/tdgssconfig: error while loading shared libraries: libstdc++-libc6.2-2.so.3: cannot open shared object file: No such file or directory

Alright, so we are missing a dependecy, lets use yum to install that:
[cj@fc14 TeraGSS]$ sudo yum provides libstdc++-libc6.2-2.so.3

Output:
compat-libstdc++-296-2.96-143.i686 : Compatibility 2.96-RH standard C++
                                   : libraries
Repo        : fedora
Matched from:
Other       : libstdc++-libc6.2-2.so.

[cj@fc14 TeraGSS]$ sudo yum install compat-libstdc++-296-2.96-143.i686

Finally lets install the tdodbc package:
[cj@fc14 td]$ cd TeraGSS
[cj@fc14 TeraGSS]$ cd ../tdodbc
[cj@fc14 tdodbc]$ sudo rpm -ihv tdodbc-13.10.00.01-1.noarch.rpm

Output:
/var/tmp/rpm-tmp.9paHwf: /opt/teradata/client/13.10/odbc_32/bin/set_default_version: /usr/bin/ksh: bad interpreter: No such file or directory

Hmmm... its looking for korn shell, lets install that package from yum:
[cj@fc14 tdodbc]$ sudo yum install ksh

Really quick lets see where FC has put ksh:
[cj@fc14 tdodbc]$ which ksh

Output:
/bin/ksh

That is not going to work for us because the package is looking for ksh in /usr/bin/ksh , lets go ahead and create a link so that the Teradata installer can work:
[cj@fc14 tdodbc]$ sudo ln -s /bin/ksh /usr/bin/ksh

Now lets uninstall the tdodbc package and reinstall it to make sure we don't run into any more problems:
[cj@fc14 tdodbc]$ sudo rpm -e tdodbc
[cj@fc14 tdodbc]$ sudo rpm -ihv tdodbc-13.10.00.01-1.noarch.rpm

Perfect!



Python ODBC, PyODBC installation:


Looking through the documentation we can see that we will need to pre-install the gcc compiler and the unixODBC-devel  package, lets get that done.
[cj@fc14 ~]$ sudo yum install gcc unixODBC-devel



Next we need to install the pyodbc package so that we can access our database natively from within Python. The pyodbc project can be accessed from here: http://code.google.com/p/pyodbc/


Make sure you go to the downloads section and get the latest pyodbc package. At the time of this writing that is pyodbc-2.1.8.


For Linux you will want to download the source installation package (for me this is: pyodbc-2.1.8.zip). From the command line move to the directory you have saved the pyodbc archive to and unzip the archive.

[cj@fc14 ~]$ cd ~/Downloads/
[cj@fc14 Downloads]$ unzip pyodbc-2.1.8.zip


Now change into the pyodbc directory and attempt to build and install pyodbc:

[cj@fc14 Downloads]$ cd pyodbc-2.1.8
[cj@fc14 pyodbc-2.1.8]$ sudo python setup.py build

Output:
gcc: error trying to exec 'cc1plus': execvp: No such file or directoryerror: command 'gcc' failed with exit status 1

Oops, we need g++ installed, lets do that:
[cj@fc14 pyodbc-2.1.8]$ sudo yum install gcc-c++

Let's try and build that pyodbc package again:
[cj@fc14 pyodbc-2.1.8]$ sudo python setup.py build

Output:
fatal error: Python.h: No such file or directory

K, we need to install the Python development package, lets do that also:
[cj@fc14 pyodbc-2.1.8]$ sudo yum install python-devel

Awesome, that works, now lets go ahead and install the package:
[cj@fc14 pyodbc-2.1.8]$ sudo python setup.py install

Good, that all worked straight off.

odbc.ini and odbcinst.ini:
Before working further we need to configure ODBC further so that it can find the Teradata drivers. There are 2 main configuration files that you can modify to perform this configuration. The odbcinst.ini and odbc.ini files. There are plenty of resources online which will describe better than I can the breadth of these configuration and I would encourage anyone to look into it a little bit.

For our purpose though Teradata conveniently generates samples of these files that work well out of the box. Go ahead and run the following commands:
[cj@fc14 ~]$ sudo updatedb
[cj@fc14 ~]$ locate odbc.ini

Output:
/opt/teradata/client/13.10/odbc_32/odbc.ini
/opt/teradata/client/ODBC_32/odbc.ini

Go ahead and copy the first sample file found into your home directory as .odbc.ini (a '.' in front of the file name makes it 'private' or 'invisible'). Similarly copy the odbcinst.ini sample file to .odbcinst.ini within your home directory:
[cj@fc14 ~]$ cp /opt/teradata/client/13.10/odbc_32/odbc.ini ~/.odbc.ini
[cj@fc14 ~]$ cp /opt/teradata/client/13.10/odbc_32/odbcinst.ini ~/.odbcinst.ini

Connecting to Teradata DB using PyODBC:

To test pyodbc we are going to create a small sample file (test.py) that will simply make a connection to our Teradata database and perform a select statement.

Lets create and edit our sample file (Insert your own username and password values):
[cj@fc14 ~]$ touch test.py
[cj@fc14 ~]$ chmod +x test.py
[cj@fc14 ~]$ vim test.py
The contents of the sample (test.py) file will look like this:

#!/usr/bin/env pythonimport pyodbcfrom pprint import pprintcnx = pyodbc.connect("DRIVER={Teradata};DBCNAME=ip_or_fqdn;UID=user;PWD=password")cursor = cnx.cursor()cursor.execute('SELECT * FROM dbc.dbcinfo')rows = cursor.fetchall()for row in rows:        pprint(row)

OK, let's try running our sample program:
[cj@fc14 ~]$ ./test.py

Output:
pyodbc.Error: ('01000', "[01000] [unixODBC][Driver Manager]Can't open lib '/opt/teradata/client/13.10/odbc_32/lib/tdata.so

K. Once again in my life I am not going to reveal how long this takes to figure out but I will tell you that tdata.so is present and accessible. The problem being reported has to do with a tdata.so having a missing dependency. Let's figure out what is missing and install it:
[cj@fc14 ~]$ ldd /opt/teradata/client/13.10/odbc_32/lib/tdata.so

Output:
libstdc++.so.5 => not found

(Use 'yum provides' to find the package that will install that library)
[cj@fc14 ~]$ sudo yum install compat-libstdc++-33-3.2.3-68.i686

Alright, once more lets go ahead and run our test program to see if we have it working:
[cj@fc14 ~]$ ./test.py

Output:
pyodbc.Error: ('200', '[200] [unixODBC][eaaa[DCTrdt rvr nbet e aao tig (0) (SQLDriverConnectW)')

OK, that's not going to work. That doesn't even look like english or anything else that pretty much looks like random memory being printed out to the screen....
Again, we are going to skip over an hour of 2 of investigation in regards to how this is tracked down. But, it turns out that Teradata has it's own version of odbc drivers and requires these libraries in order to function correctly. If you remember back we installed the unixODBC-devel package and those are the libraries that pyodbc is currently trying to read from, the difference in expected package and linked package is what is causing the problem. To fix this we need to destroy existing links within our /usr/lib directory and create links to our [Teradata]/lib directory:

[cj@fc14 lib]$ sudo rm libodbc.so
[cj@fc14 lib]$ sudo rm libodbc.so.2
[cj@fc14 lib]$ sudo rm libodbcinst.so

[cj@fc14 lib]$ sudo ln -s /opt/teradata/client/13.10/odbc_32/lib/libodbc.so libodbc.so
[cj@fc14 lib]$ sudo ln -s /opt/teradata/client/13.10/odbc_32/lib/libodbc.so libodbc.so.2
[cj@fc14 lib]$ sudo ln -s /opt/teradata/client/13.10/odbc_32/lib/libodbcinst.so libodbcinst.so


Alright, let's run that test program once more:
[cj@fc14 ~]$ ./test.py

Output:
Fatal Python error: Unable to set SQL_ATTR_CONNECTION_POOLING attribute.
Aborted (core dumped)

Well at least we are back in english... :-(
So we need to turn off connection pooling, in pyodbc we can accomplish this by adding the following line:
pyodbc.pooling = False
To the begining of our test program, test.py now looks like this:

#!/usr/bin/env pythonimport pyodbcfrom pprint import pprint
pyodbc.pooling = False
cnx = pyodbc.connect("DRIVER={Teradata};DBCNAME=ip_or_fqdn;UID=user;PWD=password")
cursor = cnx.cursor()cursor.execute('SELECT * FROM dbc.dbcinfo')rows = cursor.fetchall()for row in rows:        pprint(row)

Running our test program once more:
[cj@fc14 ~]$ ./test.py

Output:
pyodbc.Error: ('HY000', '[HY000] [DataDirect][ODBC lib] Unicode converter buffer overflow (0) (SQLDriverConnectW)')

Hmm doesn't like unicode either. We can disable this also by adding the 'ansi' flag to our pyodbc.connect() call. We need to set 'ansi' to True in this case to force pyodbc to try and connect using the non-unicode connection calls.

Our updated test.py file looks like this:

#!/usr/bin/env pythonimport pyodbcfrom pprint import pprint
pyodbc.pooling = False
cnx = pyodbc.connect("DRIVER={Teradata};DBCNAME=ip_or_fqdn;UID=user;PWD=password", ansi=True)
cursor = cnx.cursor()cursor.execute('SELECT * FROM dbc.dbcinfo')rows = cursor.fetchall()for row in rows:        pprint(row)

OK, lets run our test program once more:
[cj@fc14 ~]$ ./test.py

Output:
pyodbc.Error: ('HY000', '[HY000] [Teradata][ODBC Teradata Driver] Major Status=0x04bd Minor Status=0x20800002-[terasso]Cannot load TDGSS library. (0) (SQLDriverConnect)')

From a previous post where we got the C/C++ samples up and running we know to do the following in order to clear this error:
[cj@fc14 ~]$ sudo /opt/teradata/teragss/redhatlinux-i386/13.10.00.02/bin/run_tdgssconfig

Once that has finished executing we try once more to get our test program working:
[cj@fc14 ~]$ ./test.py

Output:
('RELEASE', '12.00.03.15')('VERSION', '12.00.03.17d')('LANGUAGE SUPPORT MODE', 'Standard')

Fin:

Well there we have it, we have verified that we can at least execute SELECT statements against our Teradata database using Python and the pyodbc package.
Please feel free to leave any comments if you run into any issues not presented in this post, assuming I have the time I am usually more than happy to help figure something out.

3 comments:

  1. Thank you for the post it helped me much to setup tdodbc on gentoo. However I have not been able to connect to teradata with pyodbc.E very time I try to connect I get the following error:

    Traceback (most recent call last):
    File "./tdodbc.py", line 4, in
    pyodbc.connect('DSN=mdw;Driver=/opt/teradata/client/13.10/odbc_64/lib/tdata.so;Username=vmtest;Password=vmtest;DBCNAME=192.168.56.102;')
    pyodbc.Error: ('IM002', '[IM002] [DataDirect][ODBC lib] Data source name not found and no default driver specified (0) (SQLDriverConnectW)')

    I tried to connect with the DSN, and/or any combination of configurations for the DSN to no avail.
    I even verified that the process read every lib and config file with strace.
    I have the same results on RHEL5,
    however I'm able to query the teradata with isql from unixODBC package and with tdxodbc binary from the tdodbc package.

    I don't have any more idea by now, did you had the same issues?

    ReplyDelete
  2. Nevermind, my problem was trying with version 2.1.6 of pyodbc which aparently does not work with teradata. Once I tried with 2.1.8 it worked like a charm.
    Just in case a gentoo user want to check it out you can fetch tdodbc and pyodbc ebuilds from my overlay at git://github.com/bdx/bdxol.git

    Thank you again.

    ReplyDelete
  3. Hi, Thanks for the amazing post.
    I followed the exact same procedure.

    sudo /opt/teradata/teragss/linux-i386/14.10.00.06/bin/run_tdgssconfig
    This ran successfully for me.

    My Python code is:

    import pyodbc
    pyodbc.pooling = False
    conn = pyodbc.connect('DRIVER={Teradata};DBCNAME=;UID=;PWD=;',ansi=True)

    Traceback (most recent call last):
    File "test.py", line 3, in
    conn = pyodbc.connect('DRIVER={Teradata};DBCNAME=;UID=;PWD=',ansi=True)
    pyodbc.Error: ('IM003', '[IM003] [DataDirect][ODBC lib] Specified driver could not be loaded (0) (SQLDriverConnect)')

    Can you suggest what could be the issue here?
    Thanks a ton in advance! :)

    ReplyDelete