Reading Excel Files with Perl

Sometimes you need to read data from Excel files with Perl. There are several Perl Modules (Spreadsheet::ParseExcel, Spreadsheet::ParseExcel::Simple) that make that job quite easy. Spreadsheet::ParseExcel::Simple is an lightweight abstraction of Spreadsheet::ParseExcel.

Sample Code:

#!/usr/bin/perl -w

use Spreadsheet::ParseExcel;

my $parser = new Spreadsheet::ParseExcel();
my $xls = $parser->Parse("PerlExcelTest.xls")
  or die "Can't read File!";

# Loop over all tables in Excel file
for(my $i = 0; $i < $xls->{SheetCount}; $i++) {
  $sheet = $xls->{Worksheet}[$i];

  print "\nSheet: " . $sheet->{Name} . "\n";

  # has the current sheet cells?
  next unless (exists ($sheet->{MaxRow}) and (exists ($sheet->{MaxCol})));

  # loop over the row of the current sheet
  foreach my $row ($sheet->{MinRow} .. $sheet->{MaxRow}) {
    
    # loop over the cells in the current row
    foreach my $column ($sheet->{MinCol} .. $sheet->{MaxCol}) {
      next unless (defined $sheet->{Cells}[$row][$column]);

      print $sheet->{Cells}[$row][$column]->Value . ' | ';
    }
    print "\n";
  }
}

During my tests I came across several issues:

  • If you create Excel files with OpenOffice numeric values show up as string GENERAL. There is a patch for that problem. Please note also that you could use $cell->Val instead of $cell->Value.
  • I also have problems to extract date values from the excel file. In that case I am not able to determine the type of a date cell. Normally I should see Date but I turns out to be Numeric.

Access MySQL Databases with Oracle HSODBC

Sometimes you need to access data stored in an different RDBMS like MySQL. Oracle provides an convenient way to access external data-sources via ODBC. To use Oracle HSODBC you can follow my step-by-step notes:

  • Install unixODBC and the MySQL ODBC Driver
    For Fedora Core 6 I installed the following RPMs:

    • unixODBC-2.2.11-7.1
    • mysql-connector-odbc-3.51.12-2.2
  • Configure /etc/odbc.ini
    [ODBC Data Sources]
    TimeSheetTest = MySQL ODBC PHP.TimeSheetTest DSN
    
    [TimeSheetTest]
    Driver       = /usr/lib/libmyodbc3.so
    Description  = MySQL ODBC 3.51 Driver DSN
    SERVER       = localhost
    PORT         = 3306
    USER         = root
    Password     = SECRET
    Database     = timesheet
    OPTION       = 3
    SOCKET       =
    
  • Test the ODBC Connection via isql
    [frank@w0004]$ isql TimeSheetTest
    Connected!
    sql-statement
    help [tablename]
    quit
    SQL> 
    
  • Oracle Configuration
    First ensure that the shell environment used to start the Oracle listener includes the following
    environment variables:

    export ODBCINI=/etc/odbc.ini
    export ODBCSYSINI=/etc
    

    To integrate that ODBC connection into Oracle you have to edit several configuration files:

    $ORACLE_HOME/hs/admin/initTimeSheetTest.ora

    HS_FDS_CONNECT_INFO = TimeSheetTest
    HS_FDS_TRACE_LEVEL = off
    HS_FDS_SHAREABLE_NAME = /usr/lib/libmyodbc3.so
    

    $ORACLE_HOME/network/admin/listener.ora

    ...
    SID_LIST_LISTENER =
    ...
        (SID_DESC =
          (PROGRAM = hsodbc)
          (ORACLE_HOME = /opt/oracle/ora10g)
          (SID_NAME = TimeSheetTest)
          (ENVS=LD_LIBRARY_PATH = /usr/lib:/opt/oracle/ora10g/lib)
        )
      )
    ...
    

    $ORACLE_HOME/network/admin/tnsnames.ora

    ...
    TIMESHEET = (DESCRIPTION = 
        (ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)(HOST = localhost)(PORT = 1521)))
        (CONNECT_DATA = (SID= TimeSheetTest))(HS=OK)
      )
    ...
    
  • Test the Configuration with tnsping
    $ tnsping timesheet
    
    TNS Ping Utility for Linux: Version 10.2.0.1.0 - Production on 05-SEP-2007 18:26:16
    
    Copyright (c) 1997, 2005, Oracle.  All rights reserved.
    
    Used TNSNAMES adapter to resolve the alias
    Attempting to contact (DESCRIPTION = (ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)
    (HOST = localhost)(PORT = 1521))) (CONNECT_DATA = (SID= TimeSheetTest))(HS=OK))
    OK (10 msec)
    
  • Create an Oracle Link in your Oracle Database instance
    SQL> CREATE PUBLIC DATABASE LINK timesheet
      CONNECT TO "root" IDENTIFIED BY "SECRET" USING 'TIMESHEET';
    

Please note that MySQL is case-sensitive on tablenames, so you have to use:

SQL> select * from "timesheet_times"@timesheet;
...