LOAD TABLE statement

Description

Imports data into a database table from an external ASCII-format file.

Syntax

LOAD [ INTO ] TABLE [ owner ].table-name
... ( load-specification [, ...] )
... FROM { 'filename-string' | filename-variable } [, ...]
... [ CHECK CONSTRAINTS { ON | OFF } ]
... [ DEFAULTS { ON | OFF } ]
... [ QUOTES { ON | OFF } ]
... ESCAPES OFF
... [ FORMAT { 'ascii' | 'binary' } ]
... [ DELIMITED BY 'string' ]
... [ STRIP { ON | OFF } ]
... [ WITH CHECKPOINT { ON | OFF } ]
... [ { BLOCK FACTOR number | BLOCK SIZE number } ]
... [ BYTE ORDER { NATIVE | HIGH | LOW } ]
... [ LIMIT number-of-rows ]
... [ NOTIFY number-of-rows ]
... [ ON FILE ERROR { ROLLBACK | FINISH | CONTINUE } ]
... [ PREVIEW { ON | OFF } ]
... [ ROW DELIMITED BY 'delimiter-string' ]
... [ SKIP number-of-rows ]
... [ WORD SKIP number ]
... [ START ROW ID number ]
... [ UNLOAD FORMAT ]
... [ IGNORE CONSTRAINT constrainttype [, ...] ]
... [ MESSAGE LOG 'string' ROW LOG 'string' [ ONLY LOG logwhat [, ...] ] ]
... [ LOG DELIMITED BY 'string' ]

Parameters

load-specification:

{ column-name [ column-spec ] | FILLER ( filler-type ) }

column-spec:

{ ASCII ( input-width ) | BINARY [ WITH NULL BYTE ] | PREFIX { 1 | 2 | 4 } | 'delimiter-string' | DATE ( input-date-format ) | DATETIME ( input-datetime-format ) } [ NULL ( { BLANKS | ZEROS | 'literal', ... } ) ]

filler-type:

{ input-width | PREFIX { 1 | 2 | 4 } | 'delimiter-string' }

constrainttype:

{ CHECK integer | UNIQUE integer | NULL integer | FOREIGN KEY integer | DATA VALUE integer | ALL integer }

logwhat:

{ CHECK | ALL | NULL | UNIQUE | DATA VALUE | FOREIGN KEY | WORD }

Examples

Example 1

LOAD TABLE product
( id   ASCII(6),
FILLER(1),
name   ASCII(15),
FILLER(1),
description   '\x09',
size   ASCII(2),
FILLER(1),
color   '\x09',
quantity   PREFIX 2,
unit_price   PREFIX 2,
FILLER(2) )
FROM 'C:\\mydata\\source1.dmp'
QUOTES OFF
ESCAPES OFF
BYTE ORDER LOW
NOTIFY 1000

Example 2

LOAD TABLE product_new
( id,
name,
description,
size,
color   '\x09'   NULL( 'null', 'none', 'na' ),
quantity   PREFIX 2,
unit_price   PREFIX 2 )
FROM '/s1/mydata/source2.dump', '/s1/mydata/source3.dump'
QUOTES OFF
ESCAPES OFF
BLOCK SIZE 100000
FORMAT 'ascii'
DELIMITED BY '\x09'
ON FILE ERROR CONTINUE
ROW DELIMITED BY '\n'

Example 3

load table PTAB1(
       ck1         ','  null ('NULL') ,
       ck3fk2c2      ','  null ('NULL') ,
       ck4         ','  null ('NULL') ,
       ck5         ','  null ('NULL') ,
       ck6c1        ','  null ('NULL') ,
       ck6c2        ','  null ('NULL') ,
       rid         ','  null ('NULL')  )
FROM 'ri_index_selfRI.inp'
       row delimited by '\n'
       LIMIT 14   SKIP 10
       IGNORE CONSTRAINT UNIQUE 2, FOREIGN KEY 8
       word skip 10 quotes off escapes off strip off

Usage

The LOAD TABLE statement allows efficient mass insertion into a database table from a file with ASCII or binary data.

The LOAD TABLE options also let you control load behavior when integrity constraints are violated and log information about the violations.

If WITH CHECKPOINT ON is not specified, the file used for loading must be retained in case recovery is required. If WITH CHECKPOINT ON is specified, a checkpoint is carried out after loading, and recovery is guaranteed even if the data file is then removed from the system.

You can use LOAD TABLE on a temporary table, but the temporary table must have been declared with ON COMMIT PRESERVE ROWS, or the next COMMIT removes the rows you have loaded.
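
The following is a minimal sketch of loading a temporary table, assuming a local temporary table. The table name temp_product, its columns, and the file path are hypothetical and serve only to illustrate the ON COMMIT PRESERVE ROWS requirement described above.

```sql
-- Hypothetical temporary table. ON COMMIT PRESERVE ROWS keeps the
-- loaded rows after the next COMMIT.
DECLARE LOCAL TEMPORARY TABLE temp_product (
    id    INT,
    name  VARCHAR(15)
) ON COMMIT PRESERVE ROWS;

-- Comma-delimited input, one row per line (hypothetical file).
LOAD TABLE temp_product
( id     ',',
  name   '\x0a' )
FROM '/s1/mydata/temp_product.csv'
QUOTES OFF
ESCAPES OFF;
```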

You can also load data from more than one file. In the FROM clause, specify each filename-string, separated by commas. However, Sybase IQ cannot guarantee that all the data can be loaded, because of memory constraints. If memory allocation fails, the entire load transaction is rolled back. The files are read one at a time, in left-to-right order as specified in the FROM clause. Any SKIP or LIMIT value applies only at the beginning of the load, not to each file.
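
As an illustration of a multi-file load, the sketch below uses a hypothetical table, columns, and file names. Under the rules described above, SKIP 1 would skip only the first row of the first file; a header row in the second file would still be read as data.

```sql
-- Hypothetical table and files. The files are read left to right.
-- SKIP and LIMIT are counted once, from the start of the load,
-- not once per file.
LOAD TABLE sales_history
( sale_id    ',',
  amount     '\x0a' )
FROM '/s1/mydata/sales_part1.csv', '/s1/mydata/sales_part2.csv'
QUOTES OFF
ESCAPES OFF
SKIP 1
LIMIT 100000
```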

Note: When loading a multiplex database, use absolute (fully qualified) paths in all file names. Do not use relative path names.

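Sybase IQ supports loading from both ASCII and binary data, and it supports both fixed- and variable-length formats. To handle all of these formats, you must supply a load-specification to tell Sybase IQ what kind of data to expect from each “column” or field in the source file. As the column-spec syntax shows, you can define fixed-width ASCII fields (ASCII with an input-width), binary fields (BINARY, or PREFIX with 1, 2, or 4 bytes), variable-length character fields ended by a delimiter-string, and DATE or DATETIME strings with a stated input format.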

Sybase IQ has built-in load optimizations for common date, time, and datetime formats. If your data to be loaded matches one of these formats, you can significantly decrease load time by using the appropriate format. For a list of these formats, and details about optimizing performance when loading date and datetime data, see Chapter 7, “Moving Data In and Out of Databases” in the Sybase IQ System Administration Guide.

You can also specify the date/time field as an ASCII fixed-width field (as described above) and use the FILLER(1) option to skip the column delimiter. For more information about specifying date and time data, see Date and time data types or Chapter 7, “Moving Data In and Out of Databases” in the Sybase IQ System Administration Guide.

The NULL portion of the column-spec indicates how to treat certain input values as NULL values when loading into the table column. These characters can include BLANKS, ZEROS, or any other list of literals you define. When specifying a NULL value or reading a NULL value from the source file, the destination column must be able to contain NULLs.

ZEROS are interpreted as follows: the cell is set to NULL if (and only if) the input data (before conversion, if ASCII) is all binary zeros (and not character zeros).

For example, if your LOAD statement includes col1 date('yymmdd') null(zeros) and the date is 000000, you receive an error indicating that 000000 cannot be converted to a DATE(4). To have the load insert a NULL value in col1 when the data is 000000, write the NULL clause as null('000000'), or modify the data to equal binary zeros and use NULL(ZEROS).
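
The NULL clause from this example can be sketched in a complete statement as follows. The table event_log, the second column, and the file path are hypothetical; col1 is assumed to be a nullable DATE column.

```sql
-- Hypothetical table and file. col1 is a 6-character ASCII date field.
-- The character string '000000' is loaded as NULL instead of being
-- passed to date conversion.
LOAD TABLE event_log
( col1        DATE('yymmdd')  NULL('000000'),
  FILLER(1),                  -- skip the column delimiter after the date
  event_name  '\x0a' )
FROM '/s1/mydata/events.dat'
QUOTES OFF
ESCAPES OFF
```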

If the length of a VARCHAR cell is zero and the cell is not NULL, you get a zero-length cell. For all other data types, if the length of the cell is zero, Sybase IQ inserts a NULL. This is ANSI behavior. For non-ANSI treatment of zero-length character data, set the Non_Ansi_Null_Varchar database option.

Another important part of the load-specification is the FILLER option. It indicates that you want to skip over a specified field in the source input file. For example, there may be characters at the end of rows, or even entire fields in the input files, that you do not want to add to the table. As with the column-spec definition, FILLER lets you specify fixed-length ASCII fields (a width in bytes), variable-length character fields delimited by a separator, and binary fields that use PREFIX bytes.
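
The three FILLER forms from the filler-type syntax can be sketched together as below. The input layout and file path are hypothetical and chosen only to show each form once.

```sql
-- Hypothetical input layout for the product table.
LOAD TABLE product
( id     ASCII(6),
  FILLER(1),              -- skip one fixed byte
  name   '\x09',          -- tab-delimited field that is loaded
  FILLER('\x09'),         -- skip an entire tab-delimited field
  FILLER(PREFIX 2),       -- skip a binary field with a 2-byte length prefix
  color  '\x0a' )
FROM '/s1/mydata/product_mixed.dat'
QUOTES OFF
ESCAPES OFF
```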

filename-string The filename-string is passed to the server as a string. The string is therefore subject to the same formatting requirements as other SQL strings. In particular, a backslash in a directory path on Windows must be written as two backslashes:

LOAD TABLE employee
FROM 'c:\\temp\\input.dat' ...

The following describes each of the clauses of the statement:

WORD SKIP Allows the load to continue when it encounters data longer than the limit specified when the word index was created.

If a row is not loaded because a word exceeds the maximum permitted size, a warning is written to the .iqmsg file. WORD size violations can be optionally logged to the MESSAGE LOG file and rejected rows logged to the ROW LOG file specified in the LOAD TABLE statement.

QUOTES This parameter is optional and the default is ON. With QUOTES turned on, LOAD TABLE expects input strings to be enclosed in quote characters. The quote character is either an apostrophe (single quote) or a quotation mark (double quote). The first such character encountered in a string is treated as the quote character for the string. String data must be terminated with a matching quote.

With QUOTES ON, column or row delimiter characters can be included in the column value. Leading and ending quote characters are assumed not to be part of the value and are excluded from the loaded data value.

To include a quote character in a value with QUOTES ON, use two quotes. For example, the following line includes a value in the third column that is a single quote character:

'123 High Street, Anytown', '(715)398-2354',''''
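
A statement that loads lines of this form might look like the following sketch. The table contact, its columns, and the file path are hypothetical.

```sql
-- Hypothetical table and file. Each input line looks like:
--   '123 High Street, Anytown', '(715)398-2354',''''
-- With QUOTES ON, the comma inside the first value is data,
-- not a column delimiter, and the third value is one quote character.
LOAD TABLE contact
( address,
  phone,
  note )
FROM '/s1/mydata/contacts.txt'
QUOTES ON
ESCAPES OFF
FORMAT 'ascii'
DELIMITED BY ','
ROW DELIMITED BY '\n'
```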

With STRIP turned on (the default), trailing blanks are stripped from values before they are inserted. Trailing blanks are stripped only for non-quoted strings. Quoted strings retain their trailing blanks. Leading blank or TAB characters are trimmed only when the QUOTES setting is ON.

The data extraction facility provides options for handling quotes (TEMP_EXTRACT_QUOTES, TEMP_EXTRACT_QUOTES_ALL, and TEMP_EXTRACT_QUOTE). If you plan to reload an extracted file whose string fields contain column or row delimiters under default ASCII extraction, use the TEMP_EXTRACT_BINARY option for the extract and the FORMAT 'binary' and QUOTES OFF options for LOAD TABLE.
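
The extract-and-reload sequence described above could be sketched as follows. This is an illustration under stated assumptions, not a verified procedure: the tables product and product_copy, the column list, and the extract path are hypothetical, and the TEMP_EXTRACT_NAME1 option used to name the extract file is not mentioned in this section.

```sql
-- Assumption: TEMP_EXTRACT_NAME1 names the extract file.
SET TEMPORARY OPTION TEMP_EXTRACT_NAME1 = '/s1/mydata/product.bin';
SET TEMPORARY OPTION TEMP_EXTRACT_BINARY = 'ON';

-- While the extract options are set, query output goes to the file.
SELECT id, name FROM product;

-- Turn extraction off again.
SET TEMPORARY OPTION TEMP_EXTRACT_NAME1 = '';

-- Reload the binary extract. BINARY WITH NULL BYTE is the column-spec
-- form from the syntax; that it matches the extract layout is an
-- assumption to verify for your version.
LOAD TABLE product_copy
( id    BINARY WITH NULL BYTE,
  name  BINARY WITH NULL BYTE )
FROM '/s1/mydata/product.bin'
FORMAT 'binary'
QUOTES OFF
ESCAPES OFF;
```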

Limits:

Exceptions:

For an example of the QUOTES option, see “Bulk loading data using the LOAD TABLE statement” in Chapter 7, “Moving Data In and Out of Databases” in the Sybase IQ System Administration Guide.

CHECK CONSTRAINTS This option defaults to ON. When you specify CHECK CONSTRAINTS ON, check constraints are evaluated, and you are free to ignore or log any violations.

Setting CHECK CONSTRAINTS OFF causes Sybase IQ to ignore all check constraint violations. This can be useful, for example, during database rebuilding. If a table has check constraints that call user-defined functions that are not yet created, the rebuild fails unless this option is set to OFF.

This option is mutually exclusive to the following options. If any of these options are specified in the same load, an error results:

DEFAULTS If the DEFAULTS option is ON (the default) and the column has a default value, that value is used. If the DEFAULTS option is OFF, any column not present in the column list is assigned NULL.

The setting for the DEFAULTS option applies to all column DEFAULT values, including AUTOINCREMENT.

For detailed information on the use of column DEFAULT values with loads and inserts, see “Using column defaults” in Chapter 9, “Ensuring Data Integrity” in the Sybase IQ System Administration Guide.
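
The effect of the DEFAULTS option can be sketched with a column that has a default value but is absent from the load column list. The table ticket, its columns, and the file path are hypothetical.

```sql
-- Hypothetical table: status has a column default and is not loaded.
CREATE TABLE ticket (
    id      INT,
    title   VARCHAR(40),
    status  VARCHAR(10) DEFAULT 'open'
);

-- DEFAULTS ON (the default): status is set to 'open' for each loaded row.
-- With DEFAULTS OFF instead, status would be assigned NULL.
LOAD TABLE ticket
( id     ',',
  title  '\x0a' )
FROM '/s1/mydata/tickets.csv'
DEFAULTS ON
QUOTES OFF
ESCAPES OFF;
```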

ESCAPES If you omit a column-spec definition for an input field and ESCAPES is ON (the default), characters following the backslash character are recognized and interpreted as special characters by the database server. Newline characters can be included as the combination \n; other characters can be included in data as hexadecimal ASCII codes, such as \x09 for the tab character. A sequence of two backslash characters (\\) is interpreted as a single backslash. For Sybase IQ, you must set this option OFF.

FORMAT Sybase IQ supports ASCII and binary input fields. The format is usually defined by the column-spec described above. If you omit that definition for a column, by default Sybase IQ uses the format defined by this option. Input lines are assumed to have ascii (the default) or binary fields, one row per line, with values separated by the column delimiter character.

DELIMITED BY If you omit a column delimiter in the column-spec definition, the default column delimiter character is a comma. You can specify an alternative column delimiter by providing a single ASCII character or the hexadecimal character representation. The DELIMITED BY clause is as follows:

... DELIMITED BY '\x09' ...

To use the newline character as a delimiter, you can specify either the special combination '\n' or its ASCII value '\x0a'. Although you can specify up to four characters in the column-spec delimiter-string, you can specify only a single character in the DELIMITED BY clause.
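
The difference between the two delimiter forms can be sketched as below. The file path and the two-character delimiter '||' are hypothetical; columns without a column-spec are assumed to fall back to the DELIMITED BY character, as described above.

```sql
-- id, name, and color have no column-spec, so they use the single
-- character given in DELIMITED BY (a tab). description carries its own
-- column-spec delimiter-string, which may be up to four characters.
LOAD TABLE product
( id,
  name,
  description  '||',
  color )
FROM '/s1/mydata/product_tabs.txt'
QUOTES OFF
ESCAPES OFF
FORMAT 'ascii'
DELIMITED BY '\x09'
ROW DELIMITED BY '\n'
```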

STRIP With STRIP turned on (the default), trailing blanks are stripped from values before they are inserted. This is effective only for VARCHAR data; it does not apply to ASCII fixed-width inserts. To turn the STRIP option off, the clause is as follows:

... STRIP OFF ...

Trailing blanks are stripped only for nonquoted strings. Quoted strings retain their trailing blanks. As an alternative, the FILLER option lets you be more specific in the number of bytes to strip instead of all the trailing spaces. It is more efficient for Sybase IQ to have this option off, and it adheres to the ANSI standard when dealing with trailing blanks. (char data is always padded, so this option only affects varchar data.)

WITH CHECKPOINT The default setting is OFF. If set to ON, a checkpoint is issued after successfully completing and logging the statement.

If WITH CHECKPOINT ON is not specified, and recovery is subsequently required, the data file used to load the table is needed for the recovery to complete successfully. If WITH CHECKPOINT ON is specified, and recovery is subsequently required, it begins after the checkpoint, and the data file need not be present.
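
A load that issues a checkpoint on completion can be sketched as follows. The column layout and file path are hypothetical.

```sql
-- After this load completes and is logged, a checkpoint is issued,
-- so the input file is not needed for a later recovery.
LOAD TABLE product
( id     ',',
  name   '\x0a' )
FROM '/s1/mydata/product.csv'
QUOTES OFF
ESCAPES OFF
WITH CHECKPOINT ON
```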

BLOCK FACTOR Specifies the blocking factor, or number of records per block, used when a tape was created. This option is not valid for inserts from variable-length input fields; use the BLOCK SIZE option instead. However, it does affect all file inserts (including from disk) with fixed-length input fields, and it can dramatically affect performance. You cannot specify this option along with the BLOCK SIZE option. The default is 10,000.

BLOCK SIZE Specifies the default size in bytes in which input should be read. This option only affects variable length input data read from files; it is not valid for fixed length input fields. It is similar to BLOCK FACTOR, but there are no restrictions on the relationship of record size to block size. You cannot specify this option along with the BLOCK FACTOR option. The default is 500,000.
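
The pairing of each option with its input type can be sketched as below. The file paths are hypothetical, and the numeric values are arbitrary illustrations, not tuning recommendations.

```sql
-- Fixed-length input fields: BLOCK FACTOR (records per block).
LOAD TABLE product
( id    ASCII(6),
  FILLER(1),
  name  ASCII(15),
  FILLER(1) )
FROM '/s1/mydata/product_fixed.dat'
QUOTES OFF
ESCAPES OFF
BLOCK FACTOR 5000;

-- Variable-length (delimited) input fields: BLOCK SIZE (bytes).
LOAD TABLE product
( id    ',',
  name  '\x0a' )
FROM '/s1/mydata/product_delim.csv'
QUOTES OFF
ESCAPES OFF
BLOCK SIZE 1000000;
```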

BYTE ORDER Specifies the byte order during reads. This option applies to all binary input fields. If none are defined, this option is ignored. Sybase IQ always reads binary data in the format native to the machine it is running on (default is NATIVE). You can also specify HIGH or LOW.

LIMIT Specifies the maximum number of rows to insert into the table. The default is 0 for no limit. The maximum is 2GB - 1.

NOTIFY Specifies that you be notified with a message each time the specified number of rows is successfully inserted into the table. The default is every 100,000 rows. The value of this option overrides the value of the NOTIFY_MODULUS database option.

ON FILE ERROR Specifies the action Sybase IQ takes when an input file cannot be opened because it does not exist or you have incorrect permissions to read the file. You can specify ROLLBACK, FINISH, or CONTINUE.

Only one ON FILE ERROR clause is permitted.

PREVIEW Displays the layout of input into the destination table including starting position, name, and data type of each column. Sybase IQ displays this information at the start of the load process. If you are writing to a log file, this information is also included in the log. This option is especially useful with partial-width inserts.

ROW DELIMITED BY Specifies a string up to 4 bytes in length that indicates the end of an input record. You can use this option only if all fields within the row are any of the following:

You cannot use this option if any input fields contain binary data. With this option, a row terminator causes any missing fields to be set to NULL. All rows must use the same row delimiter, and it must be distinct from all column delimiters. Neither the row delimiter string nor a field delimiter string can be an initial substring of the other. For example, you cannot specify “*” as a field delimiter and “*#” as the row delimiter, but you could specify “#” as the field delimiter with that row delimiter.

If a row is missing its delimiters, Sybase IQ returns an error and rolls back the entire load transaction. The only exception is the final record of a file, where it rolls back that row and returns a warning message. On Windows, a row delimiter is usually indicated by the carriage return character followed by the newline character. You might need to specify this as the delimiter-string (see above for description) for either this option or FILLER.
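
For a text file created on Windows, the row delimiter might be given as in the sketch below. It assumes that each row ends with a carriage return (0x0d) followed by a line feed (0x0a), the usual Windows line ending, and that the hexadecimal form is accepted for a two-byte delimiter-string. The file path and column list are hypothetical.

```sql
-- Rows end with carriage return + line feed, so the row delimiter
-- is the two bytes 0x0d 0x0a. Fields are comma-delimited.
LOAD TABLE product
( id,
  name,
  color )
FROM 'c:\\mydata\\product.txt'
QUOTES OFF
ESCAPES OFF
FORMAT 'ascii'
DELIMITED BY ','
ROW DELIMITED BY '\x0d\x0a'
```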

SKIP Lets you define a number of rows to skip at the beginning of the input files for this load. The default is 0.

START ROW ID Specifies the record identification number of a row in the Sybase IQ table where it should start inserting. This option is used for partial-width inserts, which are inserts into a subset of the columns in the table. By default, new rows are inserted wherever there is space in the table, and each insert starts a new row. Partial-width inserts need to start at an existing row. They also need to insert data from the source file into the destination table positionally by column, so you must specify the destination columns in the same order as their corresponding source columns. Define the format of each input column with a column-spec. The default is 0. For more information about partial-width inserts see Chapter 7, “Moving Data In and Out of Databases” in the Sybase IQ System Administration Guide.

Use the START ROW ID option for partial-width inserts only. If the columns being loaded already contain data, the insert fails.

UNLOAD FORMAT Specifies that the file has Sybase IQ internal unload formats for each column created by an earlier version of Sybase IQ (before Version 12.0). This load option has the following restrictions:

ON PARTIAL INPUT ROW Specifies the action to take when a partial input row is encountered during a load. You can specify one of the following:

IGNORE CONSTRAINT Specifies whether to ignore CHECK, UNIQUE, NULL, DATA VALUE, and FOREIGN KEY integrity constraint violations that occur during a load and the maximum number of violations to ignore before initiating a rollback. Specifying each constrainttype has the following result:

If CHECK, UNIQUE, NULL, or FOREIGN KEY is not specified in the IGNORE CONSTRAINT clause, then the load rolls back on the first occurrence of each of these types of integrity constraint violation.

If DATA VALUE is not specified in the IGNORE CONSTRAINT clause, then the load rolls back on the first occurrence of this type of integrity constraint violation, unless the database option CONVERSION_ERROR = OFF. If CONVERSION_ERROR = OFF, a warning is reported for any DATA VALUE constraint violation and the load continues.

When the load completes, an informational message regarding integrity constraint violations is logged in the .iqmsg file. This message contains the number of integrity constraint violations that occurred during the load and the number of rows that were skipped.

MESSAGE LOG Specifies the names of files in which to log information about integrity constraint violations and the types of violations to log. Timestamps indicating the start and completion of the load are logged in both the MESSAGE LOG and the ROW LOG files. Both MESSAGE LOG and ROW LOG must be specified, or no information about integrity violations is logged.

Various combinations of the IGNORE CONSTRAINT and MESSAGE LOG options result in different logging actions, as indicated in Table 6-11.

Table 6-11: LOAD TABLE logging actions

IGNORE CONSTRAINT   MESSAGE LOG
specified?          specified?    Action
-----------------   -----------   ------------------------------------------
yes                 yes           All ignored integrity constraint violations are logged, including the user-specified limit, before the rollback.
no                  yes           The first integrity constraint violation is logged before the rollback.
yes                 no            Nothing is logged.
no                  no            Nothing is logged. The first integrity constraint violation causes a rollback.

Note: Sybase strongly recommends setting the IGNORE CONSTRAINT option limit to a nonzero value if you are logging the ignored integrity constraint violations. If a single row has more than one integrity constraint violation, a row for each violation is written to the MESSAGE LOG file. Logging an excessive number of violations affects the performance of the load.

LOG DELIMITED BY Specifies the separator between data values in the ROW LOG file. The default separator is a comma.

For more details on the contents and format of the MESSAGE LOG and ROW LOG files, see “Bulk loading data using the LOAD TABLE statement” in Chapter 7, “Moving Data In and Out of Databases” in the Sybase IQ System Administration Guide.
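
The IGNORE CONSTRAINT, MESSAGE LOG, ROW LOG, and LOG DELIMITED BY clauses can be combined as in the following sketch. The table orders, its columns, the limits, and all file names are hypothetical.

```sql
-- Ignore up to 100 UNIQUE and 100 FOREIGN KEY violations before the
-- load rolls back, and log only those two violation types.
-- Both MESSAGE LOG and ROW LOG are given, as required for logging.
LOAD TABLE orders
( order_id     ',',
  customer_id  ',',
  amount       '\x0a' )
FROM '/s1/mydata/orders.csv'
QUOTES OFF
ESCAPES OFF
IGNORE CONSTRAINT UNIQUE 100, FOREIGN KEY 100
MESSAGE LOG '/s1/logs/orders_msg.log'
ROW LOG '/s1/logs/orders_row.log'
ONLY LOG UNIQUE, FOREIGN KEY
LOG DELIMITED BY '|'
```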


Side effects

None.

Standards

Permissions

The permissions required to execute a LOAD TABLE statement depend on the database server -gl command line option, as follows:

For more information, see the -gl command line option in “Server command-line switches” on page 8 in Chapter 1, “Running the Database Server” in the Sybase IQ Utility Guide.

LOAD TABLE also requires an exclusive lock on the table.

See also

INSERT statement

“LOAD_ZEROLENGTH_ASNULL option”

“NON_ANSI_NULL_VARCHAR option”

“Bulk loading data using the LOAD TABLE statement” in Chapter 7, “Moving Data In and Out of Databases” in the Sybase IQ System Administration Guide

“Monitoring disk space usage,” Chapter 1, “Troubleshooting Hints,” in the Sybase IQ Troubleshooting and Recovery Guide