LOAD TABLE statement

Description

Imports data into a database table from an external ASCII-format file.

Syntax

LOAD [ INTO ] TABLE [ owner ].table-name
... ( load-specification [, ...] )
... FROM { 'filename-string' | filename-variable } [, ...]
... [ CHECK CONSTRAINTS { ON | OFF } IGNORE CONSTRAINTconstrainttype[, ...] ]
... QUOTES OFF
... ESCAPES OFF
... [ FORMAT { 'ascii' | 'binary' } ]
... [ DELIMITED BY 'string' ]
... [ STRIP { ON | OFF } ]
... [ WITH CHECKPOINT { ON | OFF } ]
... [ { BLOCK FACTOR number | BLOCK SIZE number } ]
... [ BYTE ORDER { NATIVE | HIGH | LOW } ]
... [ LIMIT number-of-rows ]
... [ NOTIFY number-of-rows ]
... [ ON FILE ERROR { ROLLBACK | FINISH | CONTINUE} ]
... [ PREVIEW { ON | OFF } ]
... [ ROW DELIMITED BY 'delimiter-string' ]
... [ SKIP number-of-rows ]
... [ START ROW ID number ]
... [ UNLOAD FORMAT ]
... [ IGNORE CONSTRAINT constrainttype [, ...] ]
... [ MESSAGE LOGstringROW LOGstring’ [ ONLY LOG logwhat [, ...] ]
... [ LOG DELIMITED BYstring’ ]

Parameters

load-specification:

{ column-name [ column-spec ] | FILLER ( filler-type ) }

column-spec:

{ ASCII ( input-width ) | BINARY [ WITH NULL BYTE ] | PREFIX { 1 | 2 | 4 } | 'delimiter-string' | DATE ( input-date-format ) | DATETIME ( input-datetime-format ) } [ NULL ( { BLANKS | ZEROS | 'literal', ... } ) ]

filler-type:

{ input-width | PREFIX { 1 | 2 | 4 } | 'delimiter-string' }

constrainttype:

{ CHECK integer | UNIQUE integer | NULL integer | FOREIGN KEY integer | DATA VALUE integer | ALL integer }

logwhat:

{ CHECK | ALL | NULL | UNIQUE | DATA VALUE | FOREIGN KEY }

Examples

Example 1

LOAD TABLE product
( id   ASCII(6),
FILLER(1),
name   ASCII(15),
FILLER(1),
description   '\x09',
size   ASCII(2),
FILLER(1),
color   '\x09',
quantity   PREFIX 2,
unit_price   PREFIX 2,
FILLER(2) )
FROM 'C:\\mydata\\source1.dmp'
QUOTES OFF
ESCAPES OFF
BYTE ORDER LOW
NOTIFY 1000

Example 2

LOAD TABLE product_new
( id,
name,
description,
size,
color   '\x09'   NULL( 'null', 'none', 'na' ),
quantity   PREFIX 2,
unit_price   PREFIX 2 )
FROM '/s1/mydata/source2.dump', '/s1/mydata/source3.dump'
QUOTES OFF
ESCAPES OFF
BLOCKSIZE 100000
FORMAT ascii
DELIMITED BY '\x09'
ON FILE ERROR CONTINUE
ROW DELIMITED BY '\n'

Usage

The LOAD TABLE statement allows efficient mass insertion into a database table from a file with ASCII or binary data.

The LOAD TABLE options also allow you to control load behavior when integrity constraints are violated and to log information about the violations.

If the WITH CHECKPOINT ON clause is not specified, the file used for loading must be retained in case recovery is required. If WITH CHECKPOINT ON is specified, a checkpoint is carried out after loading, and recovery is guaranteed even if the data file is then removed from the system.

You can use LOAD TABLE on a temporary table, but the temporary table must have been declared with the ON COMMIT PRESERVE ROWS clause or the next COMMIT will remove the rows you've loaded.

You can also specify more than one file to load data. In the FROM clause, you specify each filename-string separated by commas. However, Sybase IQ cannot guarantee that all the data can be loaded because of memory constraints. If memory allocation fails, the entire load transaction is rolled back. The files are read one at a time, and they are processed in a left-to-right order as specified in the FROM clause. Any SKIP or LIMIT value only applies in the beginning of the load, not for each file.

NoteWhen loading a multiplex database, use absolute (fully-qualified) paths in all filenames. Do not use relative pathnames.

Sybase IQ supports loading from both ASCII and binary data, and it supports both fixed and variable length formats. To handle all of these formats, you must supply a load-specification to tell Sybase IQ what kind of data to expect from each “column” or field in the source file. The column-spec allows you to define these formats:

Sybase IQ has built-in load optimizations for common date, time and datetime formats. If your data to be loaded matches one of these formats, you can significantly decrease load time by using the appropriate format. For a list of these formats, and details about optimizing performance when loading date and datetime data, see Chapter 7, “Moving Data In and Out of Databases” in the Sybase IQ System Administration Guide.

You can also specify the date/time field as an ASCII fixed width field (as described above) and use the FILLER(1) option to skip the column delimiter. For more information about specifying date and time data, see Date and time data types or Chapter 7, “Moving Data In and Out of Databases” in the Sybase IQ System Administration Guide.

The NULL portion of the column-spec indicates how to treat certain input values as NULL values when loading into the table column. These characters can include BLANKS, ZEROS, or any other list of literals you define. When specifying a NULL value or reading a NULL value from the source file, the destination column must be able to contain NULLs.

ZEROS are interpreted as follows: the cell is set to NULL if (and only if) the input data (before conversion, if ASCII) is all binary zeros (and not character zeros).

For example, if your LOAD statement includes col1 date(‘yymmdd’) null(zeros) and the date is 000000, you will receive an error indicating that 000000 cannot be converted to a DATE(4). To get the load statement to insert a NULL value in col1 when the data is 000000, you must write the NULL clause as null(‘000000’), or modify the data to equal binary zeros and use NULL(ZEROS).

If the length of a VARCHAR cell is zero and the cell is not NULL, you get a zero-length cell. For all other data types, if the length of the cell is zero, Sybase IQ inserts a NULL. This is ANSI behavior. For non-ANSI treatment of zero-length character data, set the Non_Ansi_Null_Varchar database option.

Another important part of the load-specification is the FILLER option. It indicates you want to skip over a specified field in the source input file. For example, there may be characters at the end of rows or even entire fields in the input files that you do not want to add to the table. As with the column-spec definition, FILLER allows you to specify ASCII fixed length of bytes, variable length characters delimited by a separator, and binary fields using PREFIX bytes.

filename-string The filename-string is passed to the server as a string. The string is therefore subject to the same formatting requirements as other SQL strings. In particular:

LOAD TABLE employee
FROM 'c:\\temp\\input.dat' ...

The following describes each of the clauses of the statement.

QUOTES option If you omit a column-spec definition for an input field and QUOTES is on (the default), the LOAD statement looks for a quote character. The quote character is either an apostrophe (single quote) or a quotation mark (double quote). The first such character encountered in the input file is treated as the quote character for the input file. For Sybase IQ, you must set this option as OFF.

CHECK CONSTRAINTS option This option defaults to ON. When you specify CHECK CONSTRAINTS ON, check constraints are evaluated and you are free to ignore or log them.

Setting CHECK CONSTRAINTS OFF means that IQ ignores all check constraint violations. This can be useful, for example, during database rebuilding. If a table has check constraints that call user-defined functions that are not yet created, the rebuild fails unless this option is set to OFF.

This option is mutually exclusive to the following options. If any of these options are specified in the same load, an error results:

ESCAPES option If you omit a column-spec definition for an input field and ESCAPES is on (the default), characters following the backslash character are recognized and interpreted as special characters by the database server. Newline characters can be included as the combination \n, other characters can be included in data as hexadecimal ASCII codes, such as \x09 for the tab character. A sequence of two backslash characters (\\ ) is interpreted as a single backslash. For Sybase IQ, you must set this option as OFF.

FORMAT option Sybase IQ supports ASCII and binary input fields. The format is usually defined by the column-spec described above. If you omit that definition for a column, by default Sybase IQ uses the format defined by this option. Input lines are assumed to have ascii (the default) or binary fields, one row per line, with values separated by the column delimiter character.

DELIMITED BY option If you omit a column delimiter in the column-spec definition, the default column delimiter character is a comma. You can specify an alternative column delimiter by providing a single ASCII character or the hexadecimal character representation. The DELIMITED BY clause is as follows:

... DELIMITED BY '\x09' ...

To use the newline character as a delimiter, you can specify either the special combination '\n' or its ASCII value '\x0a'. Note that, while you can specify up to four characters in the column-spec delimiter-string, you can only specify a single character in the DELIMITED BY clause

STRIP option With STRIP turned on (the default), trailing blanks are stripped from values before they are inserted. This is effective only for VARCHAR data; it does not apply to ASCII fix-width inserts. To turn the STRIP option off, the clause is as follows:

... STRIP OFF ...

Trailing blanks are stripped only for non-quoted strings. Quoted strings retain their trailing blanks. As an alternative, the FILLER option allows you to be more specific in the number of bytes to strip instead of just all the trailing spaces. It is more efficient for Sybase IQ to have this option off, and it adheres to the ANSI standard when dealing with trailing blanks. (char data is always padded, so this option only affects varchar data.)

WITH CHECKPOINT option The default setting is OFF. If set to ON, a checkpoint is issued after successfully completing and logging the statement.

If WITH CHECKPOINT ON is not specified, and recovery is subsequently required, the data file used to load the table is needed for the recovery to complete successfully. If WITH CHECKPOINT ON is specified, and recovery is subsequently required, it will begin after the checkpoint, and the data file need not be present.

BLOCK FACTOR option Specifies blocking factor, or number of records per block, used when a tape was created. This option is not valid for inserts from variable length input fields; use the BLOCKSIZE option instead. However, it does affect all file inserts (including from disk) with fixed length input fields, and it can dramatically affect performance. You cannot specify this option along with the BLOCK SIZE option. The default is 10,000.

BLOCK SIZE option Specifies the default size in bytes in which input should be read. This option only affects variable length input data read from files; it is not valid for fixed length input fields. It is similar to BLOCK FACTOR, but there are no restrictions on the relationship of record size to block size. You cannot specify this option along with the BLOCK FACTOR option. The default is 500,000.

BYTE ORDER option Specifies the byte ordering during reads. This option applies to all binary input fields. If none are defined, this option is ignored. Sybase IQ always reads binary data in the format native to the machine it is running on (default is NATIVE). You can also specify:

LIMIT option Specifies the maximum number of rows you want to insert into the table. The default is 0 for no limit.

NOTIFY option Specifies that you be notified with a message each time the specified number of rows is successfully inserted into the table. The default is every 100,000 rows. The value of this option overrides the value of the NOTIFY_MODULUS database option.

ON FILE ERROR option Specifies the action Sybase IQ takes when an input file cannot be opened because it does not exist or you have incorrect permissions to read the file. You can specify one of the following:

Only one ON FILE ERROR clause is permitted.

PREVIEW option Displays the layout of input into the destination table including starting position, name, and data type of each column. Sybase IQ displays this information at the start of the load process. If you are writing to a log file, this information is also included in the log. This option is especially useful with partial-width inserts.

ROW DELIMITED BY option Specifies a string up to 4 bytes in length that indicates the end of an input record. You can use this option only if all fields within the row are any of the following:

You cannot use this option if any input fields contain binary data. With this option, a row terminator causes any missing fields to be set to NULL. All rows must have the same row delimiters, and it must be distinct from all column delimiters. The row and field delimiter strings cannot be an initial subset of each other. For example, you cannot specify “*” as a field delimiter and “*#” as the row delimiter, but you could specify “#” as the field delimiter with that row delimiter.

If a row is missing its delimiters, Sybase IQ returns an error and rolls back the entire load transaction. The only exception is the final record of a file where it rolls back that row and returns a warning message. On Windows, a row delimiter is usually indicated by the newline character followed by the carriage return character. You may need to specify this as the delimiter-string (see above for description) for either this option or FILLER.

SKIP option Lets you define a number of rows to skip at the beginning of the input table(s) for this load. The default is 0.

START ROW ID option Specifies the record identification number of a row in the IQ table where it should start inserting. This option is used for partial-width inserts, which are inserts into a subset of the columns in the table. By default, new rows are inserted wherever there is space in the table, and each insert starts a new row. Partial-width inserts need to start at an existing row. They also need to insert data from the source file into the destination table positionally by column, so you must specify the destination columns in the same order as their corresponding source columns. Define the format of each input column with a column-spec. The default is 0. For more information about partial-width inserts see Chapter 7, “Moving Data In and Out of Databases” of the Sybase IQ System Administration Guide.

NoteUse the START ROW ID option for partial-width inserts only! If the columns being loaded already contain data, the insert will fail. For example:

UNLOAD FORMAT option Specifies that the file has IQ internal unload formats for each column created by an earlier version of IQ (before Version 12.0). This load option has the following restrictions:

IGNORE CONSTRAINT option Specifies whether to ignore CHECK, UNIQUE, NULL, DATA VALUE, and/or FOREIGN KEY integrity constraint violations that occur during a load and the maximum number of violations to ignore before initiating a rollback. Specifying each constrainttype has the following result:

If CHECK, UNIQUE, NULL, or FOREIGN KEY is not specified in the IGNORE CONSTRAINT clause, then the load rolls back on the first occurrence of each of these types of integrity constraint violation.

If DATA VALUE is not specified in the IGNORE CONSTRAINT clause, then the load rolls back on the first occurrence of this type of integrity constraint violation, unless the database option CONVERSION_ERROR = OFF. If CONVERSION_ERROR = OFF, then a warning is reported for any DATA VALUE constraint violation and the load continues.

When the load completes, an informational message regarding integrity constraint violations is logged in the .iqmsg file. This message contains the number of integrity constraint violations that occurred during the load and the number of rows that were skipped.

MESSAGE LOG option Specifies the names of files in which to log information about integrity constraint violations and the types of violations to log. Timestamps indicating the start and completion of the load are logged in both the MESSAGE LOG and the ROW LOG files. Both MESSAGE LOG and ROW LOG must be specified, or no information about integrity violations is logged.

Various combinations of the IGNORE CONSTRAINT and MESSAGE LOG options result in different logging actions, as indicated in Table 6-11.

Table 6-11: LOAD TABLE logging actions

IGNORE CONSTRAINT specified?

MESSAGE LOG specified?

Action

yes

yes

All ignored integrity constraint violations are logged, including the user specified limit, before the rollback.

no

yes

The first integrity constraint violation is logged before the rollback.

yes

no

Nothing is logged.

no

no

Nothing is logged. The first integrity constraint violation causes a rollback.

NoteSybase strongly recommends setting the IGNORE CONSTRAINT option limit to a non-zero value, if you are logging the ignored integrity constraint violations. If a single row has more than one integrity constraint violation, a row for each violation is written to the MESSAGE LOG file. Logging an excessive number of violations affects the performance of the load.

LOG DELIMITED BY option Specifies the separator between data values in the ROW LOG file. The default separator is a comma.

For more details on the contents and format of the MESSAGE LOG and ROW LOG files, see “Bulk loading data using the LOAD TABLE statement” in Chapter 7, “Moving Data In and Out of Databases”of the Sybase IQ System Administration Guide.


Side effects

None.

Standards

Permissions

The permissions required to execute a LOAD TABLE statement depend on the database server -gl command line option, as follows:

For more information, see the -gl command line option in “Server command-line options” on page 7 in Chapter 1, “Running the Database Server” of the Sybase IQ Utility Guide.

LOAD TABLE also requires an exclusive lock on the table.

See also

INSERT statement

“NON_ANSI_NULL_VARCHAR option”

“Bulk loading data using the LOAD TABLE statement” in Chapter 7, “Moving Data In and Out of Databases” of the Sybase IQ System Administration Guide

“Monitoring disk space usage” in Chapter 1, “Troubleshooting Hints” of the Sybase IQ Troubleshooting and Error Messages Guide