Table of Contents
This chapter describes tbLoader and its usage.
tbLoader is a utility designed to load a large amount of data into a Tibero database at once. With tbLoader, the user does not need to create SQL statements to insert data into the database. Instead, column data can be converted into a general text file and loaded into the database. tbLoader is useful when loading a lot of data into Tibero database.
tbLoader enables the user to easily create data files and reduce the data loading time.
tbLoader is automatically installed and uninstalled with Tibero.
The following shows how tbLoader can be executed through a command prompt.
$ tbloader [options]
For detailed information about the command-line options, refer to “5.7. Command Line Parameters”.
The following shows an example of executing tbLoader.
[Example 5.1] Executing tbLoader
$ tbloader userid=db_user/db_password@default
control=sample.ctl data=sample.data direct=Y
tbLoader receives control and data files and outputs log and error files. Control and data files are written by the user, while log and error files are automatically created by tbLoader. I/O files in tbLoader are all text files, which can be easily created by the user.
This section describes the input (control and data files) and output files (log and error files) needed in tbLoader.
The control file defines the parameters that are used to execute tbLoader. In the control file, the user specifies which data to read, and how to read and where to save the data. For more information on configuring control file parameters, refer to “Specifying Control File Options” in “5.9. Control File Options”.
The data file is a text file that contains the data to insert into a database table. The data file uses the SPOOL commands of tbSQL to save the SQL query results. The data file can also be manually created using a general text editor. The data file can be saved in either fixed-length or delimited record format.
Fixed-length record format is used when the user specifies the POSITION property for all columns in the control file. Instead of using a delimiter, data values are read from the specified positions. The column position is determined by the byte length. This format is less flexible than a delimited record format but has higher performance.
The record size or the end of line (EOL) character can be specified to extract each record. To enhance performance, the system reads data from the column position specified in the control file, ignoring the ESCAPED BY and LINE STARTER parameter values.
A fixed-length record can only be used with the POSITION clause, not the FIELDS or LINE STARTED BY clause (refer to “5.9. Control File Options”).
Using LINES FIX clause
The LINES FIX clause is used for fixed-length (number of bytes) records. The record length must be specified in the control file, e.g., LINES FIX 12. The LINES TERMINATED BY clause is ignored.
The following is an example of loading fixed-length records.
example.ctl: LOAD DATA INFILE 'example.dat' LOGFILE 'example.log' BADFILE 'example.bad' APPEND INTO TABLE EMP LINES FIX 12 ( empno position (01:04), ename position (06:12) ) example.dat: 7922 MILLER 7777 KKS 7839 KING 7934 MILLER 7566 JONES
Not using LINES FIX clause
When the record size is not specified in the control file, the EOL character ('\n') is used as a delimiter.
The following is an example of using the EOL character as a delimiter.
example.ctl: LOAD DATA INFILE 'example.dat' LOGFILE 'example.log' BADFILE 'example.bad' APPEND INTO TABLE EMP TRAILING NULLCOLS ( empno position (01:04), ename position (06:15), job position (17:25), mgr position (27:30), sal position (32:39) )
example.dat: 7922 MILLER CLERK 7782 920.00 7777 KKS CHAIRMAN 7839 KING PRESIDENT 5500.0 7934 MILLER CLERK 7782 920.00 7566 JONES MANAGER 7839 3123.75 7658 CHAN ANALYST 7566 3450.00 7654 MARTIN SALESMAN 7698 1312.50
In the example.dat file, the second line does not have values in the last two columns. In this case, an error occurs if TRAILING NULLCOLS is not specified. For more information about TRAILING NULLCOLS refer to “5.9.18. TRAILING NULLCOLS”.
A delimited record format is used if the POSITION is not specified for all columns in the control file. Each field is separated by the FIELD TERMINATOR (field delimiter) and each record is separated by the LINE TERMINATOR (record delimiter). A delimited record format reduces performance compared to a fixed-length record format, but provides higher record format flexibility.
The following is an example of loading delimited records.
example.ctl: LOAD DATA INFILE 'example.dat' LOGFILE 'example.log' BADFILE 'example.bad' APPEND INTO TABLE emp FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' ESCAPED BY '\\' LINES TERMINATED BY '\n' ( empno, ename, job, mgr, hiredate, sal, comm, deptno )
example.dat: 7654, "Martin", "Sales",7698,1981/10/28,1312.50,3,10 7782, "\,Clark","Manager" ,7839, 1981/01/11 ,2572.50,10,20 7839, "King",President,,1981/11/17,5500.00,,10 7934,"Miller","Clerk",7782 ,1977/10/12,920.00,,10 7566, "Jones",Manager" ,7839, 1981/04/02,3123.75,,20 7658, "Chan", Ana lyst, 7566,1982/05/03,3450,,20
tbLoader execution process is recorded in the log file. The log file provides users with basic metadata of the input columns and statistics about successful and failed input records. tbLoader analyzes the failed records and provides the details of the cause.
tbLoader can load data using two methods:
The conventional path load method is the default data loading method provided by tbLoader. It reads column data from the specified data file, saves them in a column array, and loads them into the database server using batch update. This method has lower performance than the direct path load but it can be used without any constraints.
The direct path load method reads a data file specified by the user. The data is then loaded into a column array so that the data matches the column data type. The column array data is converted into Tibero database block type through a block formatter, and written to the Tibero database.
The direct path load method is significantly faster than the conventional path load method, but has the following constraints:
The CHECK and referential constraints are not checked.
The Primary Key, Unique Key, and NOT NULL constraints are checked.
The insert trigger does not execute during data loading.
The following section describes the constraints of tbLoader.
If the FIELD TERMINATOR, ENCLOSED BY, ESCAPED BY, or LINE TERMINATOR BY parameter values are the same, the data file cannot be read correctly. If the FIELD TERMINATED BY and ENCLOSED BY parameters have the same value as in the following example, an error occurs.
FIELDS TERMINATED BY '"' OPTIONALLY ENCLOSED BY '"'
If a character specified with ENCLOSED BY, FIELD TERMINATED BY, or LINE TERMINATED BY is used as an input field value in the data file, it must be specified with ESCAPE BY clause to be treated as data. For more information, refer to “5.9.14. FIELDS ESCAPED BY”.
example.ctl:
FIELDS TERMINATED BY ','
(ID, NAME)
example.dat:
7654,Martin, Kim
If the table owner and user who is trying to load the table data are different, the user must have the LOCK ANY TABLE privilege to access the table in the DPL mode. To ensure high performance via DPL method, it is the server policy to obtain the table lock before loading data in order to prevent other sessions from executing DML on the target table.
Use the following statement to grant the privilege.
grant lock any table to <user>;
When the user does not have the necessary privilege, the following error occurs.
-17004: Permission denied
This following section describes how tbLoader handles whitespaces.
A field value of the input file corresponds to the column value of a single record. If the field value consists of only whitespace(s), tbLoader loads ‘0’ or NULL according to the data type of input table column.
tbLoader regards a blank space (' '), tab ('\t'), and EOL character ('\n') as a whitespace unless the character is specified with the FIELD TERMINATED BY or LINE TERMINATED BY clause. Whitespaces at the start and end of a field is ignored, but whitespaces in the middle of a field is treated as data.
tbLoader regards whitespaces differently according to the data file type.
Whitespaces preceding a field value is regarded as actual data while whitespaces at the end is regarded as unnecessary and are truncated. The leading whitespaces can be removed by the user, but trailing whitespaces may exist intentionally for formatting purposes.
The following code illustrates the how whitespaces are handled for a fixed-length record.
" aaa \t" -> " aaa"
All whitespaces before and after the field value is regarded as unnecessary and are truncated. To insert whitespaces before or after the field value as data, enclose the whitespaces with the ENCLOSED BY string.
When there are whitespaces before and after a field value, whitespaces is truncated as follows:
" aaa \t" -> "aaa"
As previously mentioned, tbLoader truncates data but whitespaces before and after a field value can be treated as data by using the PRESERVE BLANKS clause.
The following section describes tbLoader parameters that can be entered at the command line.
If a parameter is not specified when executing tbLoader, it can be entered from the command line. The following describes the command line parameters.
$ tbloader tbLoader 6 TmaxData Corporation Copyright (c) 2008-. All rights reserved. Usage: tbloader [options] [controls] Options: -h|--help Display the more detailed information. -c|--charset Display usable character sets in this version. -v|--version Display the version information. Controls: userid Username/password@dbname control The name of the control file data The name of the data file log The name of the log file bad The name of the bad file discard The name of the discard file skip The count of line to skip in data file (Default: 0) errors The count of error to allow (Default: 50) rows The count of row for each commit (Default: commit after the end) message The count of records to process to print out (Default: 0) readsize The size of buffer to read from the data file (Default: 65536) disable_idx Whether to disable indexes before loading (Default: N) direct Whether to use the Direct Path Loading (Default: N) dpl_log Whether to log about the Direct Path Loading (Default: N) parallel The count of threads to execute Parallel Direct Path Loading (Default: 1) bindsize The size of buffer to read in the Direct Path Loading (Default: 65536) dpl_parallel Deprecated multithread Deprecated Example: tbloader userid=userid/passwd@dbname control=sample.ctl bindsize=1000000
At the command line, the user can input the metadata required for the parameters or control file. The USERID, CONTROL, DIRECT, MESSAGE, READSIZE, BINDSIZE, ERRORS, and ROWS parameters can be specified only from the command line.
The -c option displays information about the character sets that are available in the current version for backward compatibility.
The following shows an example of tbLoader parameters that can be entered at the command line.
[Example 5.2] tbLoader Command Line Parameters
userid=userid/passwd@dbname control=/home/test/control.ctl control=../control.ctl log=/home/test/control.log log=../control.log data=/home/test/data.dat data=../data.dat bad=/home/test/data.bad bad=../data.bad discard=/home/test/data.dsc discard=../data.dsc skip=1 direct=Y dpl_log=Y message=10000 readsize=65536 bindsize=65536 errors=100 rows=100 parallel=2
Item | Description |
---|---|
userid | Tibero database user name and password. Usage: userid=userid/passwd@databasename |
control | Path and name of the control file that contains parameter information. Use an absolute or relative path. |
data | Path and name of the text file that contains actual data. Use an absolute or relative path. If the path is entered at the command prompt and in the control file, the former is used. |
log | Path and name of the file that records the data loading process in tbLoader. (Default value: <Control file name>.log) Use an absolute or relative path. If the path is entered at the command prompt and in the control file, the former is used. |
bad | Path and name of the text file that records data load failures. (Default value: <Data file name>.bad> Use an absolute or relative path. If the path is entered at the command prompt and in the control file, the former is used. |
discard | Path and name of the file that contains records omitted by a WHEN conditional clause. If this parameter is not specified, omitted records are not recorded. Use an absolute or relative path. If the path is entered at the command prompt and in the control file, the former is used. |
skip | Skips (ignores) a specified number of lines from a data file, starting from the first line. (Default value: 0) Skip is same as the control file option, “5.9.19. IGNORE LINES”. |
errors | Maximum number of errors allowed when uploading data. Default value: 50 tbLoader loads data until the number of errors reaches the user specified maximum number.
|
rows | Number of records to commit, when uploading a large amount of data. Considering the tbLoader performance, the actual number of records sent to the server may not be same as this value. |
message | Parameter used to display the number of logical records tbLoader is currently processing. If not otherwise specified, the progress will not be displayed. Using a number that is too small may affect performance. |
readsize | Size of the read-only buffer that tbLoader uses to read the data file. The unit is byte. The maximum value is 2,097,152 (2MB). (Default value: 65536 (64KB)) |
disable_idx | Option to disable all indexes of the target table before loading data.
|
direct | Option to use the direct path load method for data loading.
|
dpl_log | Option to record logs in the log file of the server when loading data using direct path load.
|
parallel | Number of threads used when using direct path load. (Default value: 1) tbLoader uses as many thread(s) that read data and thread(s) that load the data to the server as the specified value in parallel. |
bindsize | Size of the direct path stream used by the client when direct path load is used. The Tibero client does not upload the data to the server until the size of the bound data reaches this value. This parameter helps increase efficiency when uploading large amounts of data. Unit: byte, Default value: 65536 (64KB), Maximum value: 10,485,760 (10MB) |
dpl_parallel | Deprecated. |
multithread | Deprecated. |
This section describes the advanced functions added to tbLoader.
If direct path load is used to transfer data, the target table is locked to prevent concurrent data loading into the same table. Parallel DPL can be used for concurrent data loading.
In Parallel DPL, the Parallel DPL flag is set in the target data, the server receives data in parallel, and the data is merged and then saved. To use parallel DPL, direct must be set to 'Y' and parallel must be set to a value greater than 1 in tbloader.
The following example shows how to use Parallel DPL.
[Example 5.3] Using Parallel DPL in tbLoader
$ tbloader userid=db_user/db_password@default
control=sample.ctl data=sample.data direct=Y parallel=2
tbLoader can be used to encrypt database access information (connect_string) using an encryption file (wallet).
To use this function, the encryption file (wallet) that stores the access information must be created by referencing Creating an Encrypted File.
The following example shows how to access the encryption file.
[Example 5.4] Using Encryption File (wallet)
$ export LR_WALLET_PATH=./wallet_file.dat $ tbloader control=sample.ctl data=sample.data
Currently, this function is supported only in UNIX with Open SSL installed.
The following section describes how to specify control file options. A user can specify the following options in the control file:
Character sets included in the data file
Data file that contains the data to load
Log file that records the events during data loading
Error file that records data that failed to load
Error number to exclude from the error file
How to process existing data in a table (APPEND|REPLACE|TRUNCATE|MERGE)
Field delimiters and other options (TERMINATOR, ENCLOSED BY STRING, ESCAPED BY STRING)
Start and end of line strings
Number of lines to ignore from the start of the data file
Options for a specific table column
The following is the control file format, and the order of the options must be preserved. The parts enclosed by square brackets ([ ]) are optional.
LOAD DATA [CHARACTERSET characterset_name] [INFILE data_file_name] [LOGFILE log_file_name] [BADFILE bad_file_name] [DISCARDFILE discard_file_name] [SKIP_ERRORS error_number, ...] [APPEND|REPLACE|TRUNCATE|MERGE(column_name, .....)] [PRESERVE BLANKS] INTO TABLE table_name [WHEN filter_expression [AND filter_expression...]] [MULTI INSERT INDEXES|FAST BUILD INDEXES] [FIELDS [TERMINATED BY field_terminator] [OPTIONALLY ENCLOSED BY enclosed_by_start_string [AND enclosed_by_end_string]] [ESCAPED BY escaped_by_string]] [LINES [FIX number] [STARTED BY line_start_string] [TERMINATED BY line_terminator_string]] [TRAILING NULLCOLS] [IGNORE number LINES] (column_name [FILLER] [POSITION(from:to)] [INTEGER EXTERNAL(size) | FLOAT EXTERNAL(size) | DOUBLE EXTERNAL(size) | CHAR(size) | RAW(size) | DATE(size) date_fmt_string | TIMESTAMP(size) timestamp_fmt_string | TIME(size) time_fmt_string] [OUTFILE] [CONSTANT constant_value] [NULL TERMINATED] [PRESERVE BLANKS] [sql_expression] [NULLIF cond_expression], ..........) -- This line is a comment.
tbLoader can load data of various character sets into Tibero server. This option affects the character sets of the control and data files. If the parameter value is not specified, TB_NLS_LANGUP_NLS_LANG in the dsn file is used as the default character set of data files.
The following describes how to set a character set:
Syntax
CHARACTERSET characterset_name
Parameter | Description |
---|---|
characterset_name | Name of the character set, which can be used as TB_NLS_LANG. (Default value: TB_NLS_LANG value in the environment configuration file or dsn file) |
Example
If the character set of the current client is set to KSC5601 (TB_NLS_LANG =EUCKR), add the following to the control file to load a MSWIN949 data file into Tibero database.
CHARACTERSET MSWIN949
The data_file_name parameter specifies the path and name of the data file (text file) containing actual data. The path can be an absolute or relative path.
If the path is entered at the command prompt and in the control file, the former is used.
The following describes how to specify the data file name and path:
Syntax
INFILE data_file_name
Parameter | Description |
---|---|
data_file_name | Path and name of the data file. |
Example
INFILE '/home/test/data.dat' INFILE '../data.dat'
The log_file_name parameter specifies the path and name of the log file that records the data loading process in tbLoader. The path can be an absolute or relative path.
If the path is entered at the command prompt and in the control file, the former is used.
The following describes how to specify the log file:
Syntax
LOGFILE log_file_name
Parameter | Description |
---|---|
log_file_name | Path and name of the log file. (Default value: <control file name>.log) |
Example
LOGFILE '/home/test/control.log' LOGFILE '../control.log'
The bad_file_name parameter specifies the path and name of file that records data loading failures. The path can be an absolute or relative path.
If the path is entered at the command prompt and in the control file, the former is used.
If an error occurs while using DPL for data loading, failed records are not logged. This is because identifying failed rows is difficult in DPL, which is designed for high performance.
The following describes how to specify the bad (error) file:
Syntax
BADFILE bad_file_name
Parameter | Description |
---|---|
bad_file_name | Path and name of bad file. (Default value: <data file name>.bad) |
Example
BADFILE '/home/test/data.bad' BADFILE '../data.bad'
The discard_file_name parameter specifies the path and name of file that contains records omitted by a WHEN conditional clause. The path can be an absolute or relative path.
If the path is entered at the command prompt and in the control file, the former is used.
The following describes how to specify the file:
Syntax
DISCARDFILE discard_file_name
Parameter | Description |
---|---|
discard_file_name | Path and name of the file. |
Example
DISCARDFILE '/home/test/data.dsc' DISCARDFILE '../data.dsc'
The error_number parameter specifies the error numbers that will not be logged in the error file.
The following describes how to specify the error number(s):
Syntax
SKIP_ERRORS error_number, ...
Parameter | Description |
---|---|
error_number | Error numbers to exclude from the error file. |
Example
SKIP_ERRORS -10007 SKIP_ERRORS -10007, -10005
A data processing method can be specified for existing table data.
The REPLACE and TRUNCATE options cause a deleted record to be automatically committed. Hence, data cannot be recovered after executing tbLoader.
The following describes how to specify the processing of existing data:
Syntax
APPEND|REPLACE|TRUNCATE|MERGE(column_name, .....)
Parameter | Description |
---|---|
APPEND | Appends new data to the existing data. (Default value) |
REPLACE | Replaces existing data with new data using the DELETE command. This requires DELETE permissions. If a trigger is associated with the DELETE command, the trigger will be executed. If the table has a referential integrity constraint, it must be disabled before using the DELETE option. |
TRUNCATE | Deletes the existing data with the TRUNCATE clause and inserts new data. Note: If the table has a referential integrity constraint, it must be disabled before using the TRUNCATE option. |
MERGE(column_name, .....) | List of columns to merge. tbLoader uses the specified column list as key values. If the new data has the same key value as the existing data, the new data overwrites the existing data. If there is no matching key value, the data is inserted as a new record. |
Example
control.ctl: LOAD DATA ... REPLACE ... (...)
control.ctl: LOAD DATA ... MERGE(EMPNO, EMPNM) ... (...)
PRESERVE BLANKS clause is used to enter field data into the database by preserving white spaces.
Example
control.ctl: LOAD DATA ... PRESERVE BLANKS ... (...)
The table_name parameter specifies the target table name.
The following describes how to specify a table name:
Syntax
INTO TABLE table_name
Parameter | Description |
---|---|
table_name | Target table name. A PUBLIC synonym can be used. |
Example
control.ctl: LOAD DATA ... INTO TABLE EMP ... (...)
Before loading a data file, the field value conditions can be specified.
The following describes how to set the conditions:
Syntax
WHEN filter_expression [AND filter_expression...] filter_expression: {column_name|'(' column_pos ')'} operator value_string
Parameter | Description |
---|---|
column_name | Column name to compare. |
column_pos | Column number to compare. Starts with 1 and is enclosed by parenthesis. |
operator | Comparison operator. Use an equality (=) or inequality (!=, <>) operator. |
value_string | Value to compare to. Must use a string enclosed with single quotes. |
Example
control.ctl: LOAD DATA ... INTO TABLE EMP WHEN C1 = '1' ... (...)
The following describes how to create indexes when loading data into the target table using the direct path load method. Select either the multi insert or fast build method.
When using tbLoader, it is better to use the MULTI INSERT method when loading data from a table with large amount of data. Otherwise, the FAST BUILD method is recommended. Note that when using DPL, the index state may change to unusable due to issues such as data redundancy.
The following describes how to create indexes:
Syntax
[MULTI INSERT INDEXES|FAST BUILD INDEXES]
Parameter | Description |
---|---|
MULTI INSERT INDEXES | Creates indexes with the MULTI INSERT method. |
FAST BUILD INDEXES | Creates indexes with the FAST BUILD method. |
Example
control.ctl: LOAD DATA ... MULTI INSERT INDEXES ... (...)
When reading records from a data file, each field is delimited by the FIELD TERMINATED BY parameter value.
The following describes how to define the field_terminator parameter:
Syntax
FIELDS TERMINATED BY field_terminator
Parameter | Description |
---|---|
field_terminator | ASCII character string. |
Example
control.ctl: LOAD DATA ... FIELDS TERMINATED BY ',' ... (...)
The FIELDS OPTIONALLY ENCLOSED BY clause specifies the character strings that enclose the start and end of a field value when reading records from a data file.
tbLoader treats a field value that contains meta-character strings, such as blank space (' ', '\t', '\r', '\n'), field terminator, and line terminator, as data. An ESCAPED BY string, however, is treated as a meta-character string even if the string is enclosed by the ENCLOSED BY string.
If a field value contains the ENCLOSED BY string, the ESCAPED BY value must be used as its prefix. Otherwise, the string is treated as the ENCLOSED BY string and the rest of the field value is not considered as data.
The following describes how to use the FIELDS OPTIONALLY ENCLOSED BY clause:
Syntax
FIELDS OPTIONALLY ENCLOSED BY enclosed_by_start_string
Parameter | Description |
---|---|
enclosed_by_start_string | ASCII character string. |
Example
The following specifies the same start and end strings:
control.ctl: LOAD DATA ... FIELDS OPTIONALLY ENCLOSED BY '"' ... (...)
The following specifies different start and end strings:
control.ctl: LOAD DATA ... FIELDS OPTIONALLY ENCLOSED BY '{$' AND '$}' ... (...)
For the following control file, the data is specified as below.
control.ctl: LOAD DATA ... FIELDS OPTIONALLY ENCLOSED BY '"' ESCAPED BY '\\' ... (...)
To ensure that a field value that contains the ENCLOSED BY value is processed correctly, use the ESCAPED BY string as the prefix.
”quotation mark[\”]”,0001→(quotation mark["],0001) ”quotation mark[”]”,0001→Error occurs
Special characters that come after the ESCAPE BY string are read as data. Use a double backslash '\\' to escape a back slash ('\').
The following describes how to set the ESCAPED BY string:
Syntax
ESCAPED BY escaped_by_string
Parameter | Description |
---|---|
escaped_by_string | ASCII character string. |
Example
ESCAPED BY '\\' ESCAPED BY '$$!'
The LINES FIX clause specifies the length of a line in a data file. The FIELDS, LINE TERMINATED BY and LINE STARTED BY clauses cannot be used with this clause.
The following describes how to use the LINES FIX clause:
Syntax
LINES FIX number
Parameter | Description |
---|---|
number | Integer value. (Unit: byte) |
Example
control.ctl: LOAD DATA ... LINES FIX 5 ... (...)
Specify the LINES FIX parameter as shown in the previous example. Assume that the data file contains the following data.
data.dat: abcdefghijklmnopqrst
tbLoader reads the data file and loads the following data:
LINE #1: abcde LINE #2: fghij LINE #3: klmno LINE #4: pqrst
The LINES STARTED BY clause specifies a prefix that indicates where to start reading data on a line when reading data line by line from a data file. If a line does not contain the prefix, the entire line is ignored. For performance reasons, a prefix must consist of a sequence of the same character.
The line start string can be specified as follows:
Syntax
LINES STARTED BY line_start_string
Parameter | Description |
---|---|
line_start_string | ASCII character string (maximum 30 bytes). |
Example
control.ctl: LOAD DATA ... LINES STARTED BY '$$$' ... (...)
Specify the parameter as in the previous example, and assume that the data file contains the following data.
data.dat: $$$0001,”SMITH” … something … $$$0002,”DAVID”
tbLoader reads the data file and loads the following data:
(0001, “SMITH”), (0002, “DAVID”)
When reading data from a data file, the LINE TERMINATOR signals the end of a record.
The line end string can be specified as follows:
Syntax
LINES TERMINATED BY line_terminator_string
Parameter | Description |
---|---|
line_terminator_string | ASCII character string. (Default value: '\n'). |
Example
The following is an example of specifying the LINE TERMINATOR string as '|\n'.
control.ctl: LOAD DATA ... LINES TERMINATED BY '|\n' ... (...)
In Windows files, use '|\r\n' as the line terminator.
TRAILING NULLCOLS clause treats a missing value in the last column of a record as a null column instead of as an error.
The following describes how to specify the TRAILING NULLCOLS clause:
Syntax
TRAILING NULLCOLS
Example
The following is an example of setting the ‘job’ column of a record that is missing a value to NULL.
control.ctl: LOAD DATA ... TRAILING NULLCOLS ... ( id, name, job )
data.dat: 10 SMITH
The IGNORE LINES clause ignores the specified number of lines from the start of the data file.
The following describes how to specify the number of lines to ignore:
Syntax
IGNORE number LINES
Parameter | Description |
---|---|
number | Integer value. (Default value: 0) |
Example
If the first line of a data file contains column names, specify as follows to ignore the first line.
control.ctl: LOAD DATA ... IGNORE 1 LINES ... (...)
The column properties of the target table are specified in the control file. Note that the columns must be specified in the order they appear in the data file.
The following sections describe the column properties.
The following describes how to specify the columns and their properties:
Syntax
(column_name [FILLER] [POSITION(from:to)] [INTEGER EXTERNAL(size) | FLOAT EXTERNAL(size) | DOUBLE EXTERNAL(size) | CHAR(size) | RAW(size) | DATE(size) date_fmt_string | TIMESTAMP(size) timestamp_fmt_string | TIME(size) time_fmt_string] [OUTFILE] [CONSTANT constant_value] [NULL TERMINATED] [PRESERVE BLANKS] [sql_expression] [NULLIF cond_expression], ..........)
Parameter | Description |
---|---|
column_name | Column name. |
The FILLER clause specifies a column to exclude from the data file.
The following describes the FILLER clause.
Syntax
FILLER
Example
For a fixed-length record, specify the column as FILLER before the position clause.
control.ctl: LOAD DATA ... ( empno position (01:04), ename position (06:15), job filler position (17:25), mgr position (27:30), sal position (32:39), comm position (41:48), deptno position (50:51) )
For a delimited record, specify the column as FILLER after the column name.
control.ctl: LOAD DATA ... ( empno, ename, job filler, mgr, sal, comm, deptno )
Specifies the start and end positions of the column in a line of the data file.
The following describes how to specify the POSITION:
Syntax
POSITION(from:to)
Parameter | Description |
---|---|
from | Start position of the column in the line. The line starts at 1. |
to | Last position of the column in the line. |
Example
For a fixed-length record, specify the exact position in the column list as follows:
control.ctl: LOAD DATA ... ( empno position (01:04), ename position (06:15), job position (17:25), mgr position (27:30), sal position (32:39), comm position (41:48), deptno position (50:51) )
For a delimited record, the POSITION clause is not required.
control.ctl: LOAD DATA ... ( empno, ename, job, mgr, sal, comm, deptno )
tbLoader supports a set of data types. The default values and column data binding methods vary by data type.
The following are the data types provided by tbLoader:
Numeric data type
Character string data type
Binary data type
DATETIME data type
Numeric data type is used to load character string data into numeric columns in Tibero. A NULL value is bound to 0 by default and loaded into the database. To insert a NULL value, the column must be specified as a character string data type.
The following describes how to specify a numeric type column:
Syntax
INTEGER EXTERNAL(size)|FLOAT EXTERNAL(size)|DOUBLE EXTERNAL(size)
Parameter | Description |
---|---|
INTEGER EXTERNAL(size) | Used when the character string data is an integer type. |
FLOAT EXTERNAL(size) | Used when the character string data is a floating type. |
DOUBLE EXTERNAL(size) | Used when the character string data is a double precision floating type. |
For more information about the size option, refer to “5.9.20.4. Data Buffer Size”.
Character string data type is used to load character string data into character type (CHAR, VARCHAR, CLOB, and NCLOB) columns in Tibero. A NULL value is inserted into the database as NULL. To insert a 0 for a NULL value, the column must be specified as a numeric data type.
The following describes how to specify a character string type column:
Syntax
CHAR(size)
Parameter | Description |
---|---|
CHAR(size) | Used for character string data. For more information about the size option, refer to “5.9.20.4. Data Buffer Size”. |
Binary data type is used to load binary data into large object type columns (RAW, BLOB, and LONGRAW) in Tibero. A NULL value is inserted into the database as NULL.
The following describes how to specify a binary type column:
Syntax
RAW(size)
Parameter | Description |
---|---|
RAW(size) | Used for binary data. For more information about the size, refer to “5.9.20.4. Data Buffer Size”. |
The DATETIME data type is used to load character string data into date type (DATE, TIME, and TIMESTAMP) columns in Tibero. A NULL value is inserted into the database as NULL.
Tibero client uses the TB_NLS_DATE_FORMATUP_NLS_DATE_FORMAT and TB_NLS_TIMESTAMP_FORMATUP_NLS_TIMESTAMP_FORMAT clauses to declare DATE and TIMESTAMP type columns. A DATE, TIMESTAMP, or TIME type columns can be specified directly in the control file of tbLoader. The specified column types must be enclosed by double quotes (" ").
The following describes how to specify a DATETIME column:
Syntax
DATE date_fmt_string|TIMESTAMP timestamp_fmt_string|TIME time_fmt_string
Parameter | Description |
---|---|
date_fmt_string | DATE type column. |
timestamp_fmt_string | TIMESTAMP type column. |
time_fmt_string | TIME type column. |
Example
The following example shows a sample control file of tbLoader. This example specifies the date format, 'YYYYMMDD', for the hiredate column in the data.dat file.
control.ctl: LOAD DATA ..... INFILE 'data.dat' ..... ( empno, ename, hiredate DATE "YYYYMMDD" )
data.dat: 1111, JOHN, 19981112 2222, TOM, 20070802
tbLoader reads data from the data file and saves them in the data buffer. tbLoader then converts the data into a Tibero server data type and loads the data into the server.
To read delimited records, tbLoader uses the database schema information to allocate an appropriate buffer size. To read fixed-length records, tbLoader uses the start and end values of the POSITION clause to allocates the buffer size.
If the data length is known, the user can specify the data buffer size. The user can also specify the data buffer size (in bytes) to read large object data types such as BLOB, CLOB, LONG, and LONG RAW.
Example
In the following control file, 5 and 10 bytes of data buffer are allocated to empno and hiredate columns, respectively.
control.ctl: LOAD DATA .... ( empno INTEGER EXTERNAL(5), hiredate DATE(10) "YYYY/MM/DD" )
The OUTFILE property can be specified to read large object data types (BLOB, CLOB, LONG, and LONG RAW type) or binary data such as RAW from another file instead of the data file by specifying the file path in the data file.
If the OUTFILE property is set, tbLoader internally buffers the data from the file and loads them to the server in multiple operations. Otherwise, tbLoader directly reads large object data types from the data file and allocates a 32 KB data buffer, by default. To read larger data, the data buffer size, “5.9.20.4. Data Buffer Size”, can be set manually.
Example
For the following control file,
control.ctl: LOAD DATA ..... INFILE 'data.dat' ..... ( empno, ename, resume OUTFILE )
tbLoader reads data from a file if the file path is set in the resume column of the data.dat file.
data.dat 1111, JOHN, ./clobdata.txt 2222, TOM, /home/test/clobdata.txt
The value of a specific column can be specified as CONSTANT regardless of the data file value. The value must be wrapped in double quotes (" ") because it may include a blank character.
Example
In the following control file, the column (empno) value is set to 1234 regardless of the data value in the data.dat file.
control.ctl: LOAD DATA ..... INFILE 'data.dat' ..... ( empno constant "1234", ename )
Used as a column property, PRESERVE BLANKS, preserves whitespaces in the specified column instead of all fields.
Example
In the following control file, whitespaces are preserved in the first column 'empno'.
control.ctl: LOAD DATA ... ( empno PRESERVE BLANKS, ename )
A SQL expression can be used for a column value. The SQL expression must be wrapped in double quotes (" "), and data can be bound to the column by using a colon (:) with the column name in the expression.
Example
In the following control file, the TO_CHAR function sets the SYSDATE value to the 'empno' column as a character string of the format 'YYYYMMDD'. The 'empno' column is bound to the 'ename' column from the data file.
control.ctl: LOAD DATA ... ( empno "TO_CHAR(SYSDATE, 'YYYYMMDD')", ename ":empno" )
Specifies a conditional statement to replace a specific column value with NULL.
The following describes the NULLIF clause.
Syntax
NULLIF {column_name|'(' column_pos ')'} operator value_string
Parameter | Description |
---|---|
column_name | Column name to compare. |
column_pos | Column number to compare. Starts with 1, enclosed with parenthesis. |
operator | Comparison operator. Use an equality (=) or inequality (!=, <>) operator. |
value_string | Value to compare to. String must be enclosed with single quotes. |
Example
In the following control file, 0 in the empno column is replaced with NULL.
control.ctl: LOAD DATA ... ( empno CHAR, ename CHAR NULLIF empno = '0' )
This section describes examples of loading delimited records, fixed-length records, and large object data types (BLOB and CLOB).
Create a table (used in all examples).
Create a control file.
Create a data file.
Execute tbLoader.
Check the log and error files.
The following is an example of creating a table. In this example, the database name is the default value and the table owner is loader/loader_pw.
CREATE TABLE MEMBER ( ID NUMBER(4) NOT NULL, NAME VARCHAR2(10), JOB VARCHAR2(9), BIRTHDATE DATE, CITY VARCHAR2(10), PICTURE BLOB, AGE NUMBER(3), RESUME CLOB ); CREATE TABLE CLUB ( ID NUMBER(6) NOT NULL, NAME VARCHAR2(10), MASTERID NUMBER(4) );
Use the convention path load method to load delimited records to Tibero server.
The following is the control file, control.ctl.
LOAD DATA INFILE './data.dat' APPEND INTO TABLE club FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' LINES TERMINATED BY '|\n' IGNORE 1 LINES ( id integer external, name, masterid integer external )
The control file includes the following information:
Data file
The data.dat file in the current directory.
Log file
If the user does not enter the log file name at the command prompt, the file is created with the control file name by default. In this example, a file named control.log is created.
Error file
If the user does not enter the error file name at the command prompt, the file is created with the data file name by default. In this example, a file named data.bad is created. However, the error file is not created if there are no error logs.
Target table
The id, name, and masterid column data will be inserted into the club table.
FIELDS TERMINATED BY, FIELDS OPTIONALLY ENCLOSED BY, FIELDS ESCAPED BY, and LINES TERMINATED BY character strings
Use a comma (,) for FIELD TERMINATOR, double quotes (" ") for FIELDS OPTIONALLY ENCLOSED BY, and the newline character (|\n) for LINES TERMINATED BY. Since the FIELDS ESCAPED BY character string is not specified, the default value, (\\), is used.
Loading data
The first line of the data file is ignored due to the IGNORE LINES clause.
The following are the data in the data.dat file.
id name masterid| 111111,FC-SNIFER,2345| dkkkkkkkkkk| 111112,"DOCTOR CLUBE ZZANG",2222| 111113,"ARTLOVE",3333| 111114,FINANCE,1235| 111115,"DANCE MANIA",2456| 111116,"MUHANZILZU",2378| 111117,"INT'L",5555
Execute tbLoader as follows:
tbloader userid=loader/loader_pw@default control=./control.ctl
After executing tbLoader, check the log and error files.
The log file contains the logs generated while executing tbLoader.
tbLoader 6 TmaxData Corporation Copyright (c) 2008-. All rights reserved. Data File : ./data.dat Bad File : ./data.bad Table 'CLUB' was loaded from the data file. Column Name Position DataType ------------------------------ ---------- ------------ ID 1 NUMERIC EXTERNAL NAME 2 CHARACTER MASTERID 3 NUMERIC EXTERNAL Record 2 was rejected. TBR-80053 : Some relatively positioned columns are not present in the record. Record 3 was rejected. TBR-80025 : Column data exceeds data buffer size. Record 6 was rejected. TBR-80025 : Column data exceeds data buffer size. Table 'CLUB' ---------------- 8 Rows were requested to load. 5 Rows were loaded successfully. 3 Rows were failed to load because of some errors Elapsed time was: 00:00:00.407089
The error file contains data that tbLoader failed to load. The failed data can be modified and reloaded into Tibero.
dkkkkkkkkkk| 111112,"DOCTOR CLUBE ZZANG",2222| 111115,"DANCE MANIA",2456|
Use the direct path load method to load fixed-length records to Tibero server.
The following is the control file, control.ctl.
LOAD DATA INFILE '/home/test/data.dat' APPEND INTO TABLE MEMBER ( id position (01:04) integer external, name position (06:15), job position (17:25), birthdate position (27:36), city position (38:47), age position (49:51) integer external )
The control file includes the following information:
Data file
The data.dat file in the /home/test directory.
Log file
If the user does not enter the log file name at the command prompt, the file is created with the control file name by default. In this example, a file named control.log is created.
Error file
If the user does not enter the error file name at the command prompt, the file is created with the data file name by default. In this example, a file named data.bad is created. However, the error file is not created if there are no error logs.
Target table
The name, job, birth date, city, and age column data will be inserted into the member table.
FIELDS TERMINATED BY, FIELDS OPTIONALLY ENCLOSED BY, FIELDS ESCAPED BY, and LINES TERMINATED BY character strings
The aforementioned delimiter parameters are not used here because the data is in fixed-length record format. Since LINES FIX also cannot be used, the EOL character is used to terminate each line.
Loading data
Loads data starting from the first line of the data file.
The following are the data in the data.dat file. Because tbLoader reads in fixed-length records, the column data must be accurately positioned in the data file.
7777 KKS CHAIRMAN 1975-11-18 SEOUL 33 7839 KING PRESIDEN DAEGU 45 7934 MILLER CLERK 1967-01-24 BUSAN 37 7566 JONES MANAGER 7499 ALLEN SALESMAN ddddddddd KYUNG-JU aaaa7654 MARTIN SALESMAN 7648 CHAN ANALYST 1979-10-11 INCHON 28
Execute tbLoader as follows:
tbloader userid=loader/loader_pw@default control=./control.ctl direct=Y
After executing tbLoader, check the log and error files.
The log file contains the logs generated while executing tbLoader.
tbLoader 6 TmaxData Corporation Copyright (c) 2008-. All rights reserved. Data File : ./data.dat Bad File : ./data.bad Table 'MEMBER' was loaded from the data file. Column Name Position DataType ------------------------------ ---------- ------------ ID 1 NUMERIC EXTERNAL NAME 2 CHARACTER JOB 3 CHARACTER BIRTHDATE 4 DATE CITY 5 CHARACTER AGE 6 NUMERIC EXTERNAL Record 4 was rejected. TBR-80053 : Some relatively positioned columns are not present in the record. Record 5 was rejected. TBR-80053 : Some relatively positioned columns are not present in the record. Record 6 was rejected. TBR-80053 : Some relatively positioned columns are not present in the record. Table 'MEMBER' ---------------- 7 Rows were requested to load. 4 Rows were loaded successfully. 3 Rows were failed to load because of some errors Elapsed time was: 00:00:00.338089
In the previous log file, there are 3 rows that failed to load due to the last column of the row being empty. The last column value can be bound to NULL by using the “5.9.18. TRAILING NULLCOLS” clause.
The error file contains data that tbLoader failed to load. The failed data can be modified and reloaded into Tibero.
7566 JONES MANAGER 7499 ALLEN SALESMAN ddddddddd KYUNG-JU aaaa7654 MARTIN SALESMAN
Use the conventional path load method to load fixed-length records to Tibero server.
The following is the control file, control.ctl.
LOAD DATA INFILE './data.dat' APPEND INTO TABLE MEMBER LINES FIX 51 TRAILING NULLCOLS ( id position (01:04) integer external, name position (06:15), job position (17:25), birthdate position (27:36), city position (38:47), age position (49:50) integer external )
The control file includes the following information:
Data file
The data.dat file in the current directory.
Log file
If the user does not enter the log file name at the command prompt, the file is created with the control file name by default. In this example, a file named control.log is created.
Error file
If the user does not enter the error file name at the command prompt, the file is created with the data file name by default. In this example, a file named data.bad is created. However, the error file is not created if there are no error logs.
Target table
The name, job, birth date, city, and age column data will be inserted into the member table.
FIELDS TERMINATED BY, FIELDS OPTIONALLY ENCLOSED BY, FIELDS ESCAPED BY, and LINES TERMINATED BY character strings
The aforementioned delimiter parameters are not used here because the data is in fixed-length record format. Since LINES FIX also cannot be used, the EOL character is used to terminate each line.
Loading data
Loads data starting from the first line of the data file.
Binds the last column to NULL when its value is missing in the data file.
The following are the data in the data.dat file. Because tbLoader reads in fixed-length records, the column data must be accurately positioned in the data file.
7777 KKS CHAIRMAN 1975-11-18 SEOUL 33 7839 KING PRESIDEN DAEGU 45 7934 MILLER CLERK 1967-01-24 BUSAN 37 7566 JONES MANAGER 7499 ALLEN SALESMAN ddddddddd KYUNG-JU aaaa7654 MARTIN SALESMAN 7648 CHAN ANALYST 1979-10-11 INCHON 28
Execute tbLoader as follows.
tbloader userid=loader/loader_pw@default control=./control.ctl
After executing tbLoader, check the log and error files.
The log file contains the logs generated while executing tbLoader.
tbLoader 6 TmaxData Corporation Copyright (c) 2008-. All rights reserved. Data File : ./data.dat Bad File : ./data.bad Table 'MEMBER' was loaded from the data file. Column Name Position DataType ------------------------------ ---------- ------------ ID 1 NUMERIC EXTERNAL NAME 2 CHARACTER JOB 3 CHARACTER BIRTHDATE 4 DATE CITY 5 CHARACTER AGE 6 NUMERIC EXTERNAL Record 2 was rejected. TBR-5074: Given string does not represent a number in proper format. Record 4 was rejected. TBR-5045: Only TIME format can be omitted. Table 'MEMBER' ---------------- 6 Rows were requested to load. 4 Rows were loaded successfully. 2 Rows were failed to load because of some errors Elapsed time was: 00:00:00.022404
The error file contains data that tbLoader failed to load. The failed data can be modified and reloaded into Tibero.
7839 KING PRESIDENT 1963-04-13 DAEGU 7566 JONES MANAGER 1955-08-20 53
Use the conventional path load method to load delimited record data into Tibero server.
The following is a sample control file that includes a BLOB column for large binary data. A CLOB column for large character data can be used in the same way.
The following is the control file, control.ctl.
LOAD DATA INFILE './data.dat' LOGFILE './logfile.log' BADFILE './badfile.bad' APPEND INTO TABLE member FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' ESCAPED BY '\\' LINES TERMINATED BY '\n' TRAILING NULLCOLS IGNORE 2 LINES ( id integer external, name, job, birthdate, city, age integer external, picture outfile ) id, name, job, birthdate, city, age, picture outfile )
The control file includes the following information:
Data file
The data.dat file in the current directory.
Log file
If the user does not enter the log file name at the command prompt, the logfile.log file is created in the current directory.
Error file
If the user does not enter the error file name at the command prompt, the badfile.bad file is created in the current directory. However, the error file is not created if there are no error logs.
Target table
The name, job, birth date, city, and age column data will be inserted into the member table.
FIELDS TERMINATED BY, FIELDS OPTIONALLY ENCLOSED BY, FIELDS ESCAPED BY, and LINES TERMINATED BY character strings.
Use a comma (,) for FIELD TERMINATOR, double quotes (" ") for FIELDS OPTIONALLY ENCLOSED BY, and the newline character (|\n) for LINES TERMINATED BY.
Loading data
The first line of the data file is ignored due to the IGNORE LINES clause.
Binds the last column to NULL when its value is missing in the data file.
The following are the data in the data.dat file. Enter the path of the binary file to upload the binary data to BLOB column.
Ignores the first line. 7782,"Clark","Manager",1981-01-11 , DAEGU,26,./blob.jpg 7839,"King",President,1960/11/17,SEOUL,47 7934,"Miller","Clerk",1977/10/12,BUSAN,30,./blob2.jpg 7566,"Jones",Manager\, ,1981/04/02,31 7499, "Allen", "Salesman" a,1981:02/20,26 7654, "Martin", "Sale smn", 1981/10/28,26 7658, "Chan",Ana lyst, 1982/05/03,25,25
Execute tbLoader as follows:
tbloader userid=loader/loader_pw@default control=./control.ctl
After executing tbLoader, check the log and error files.
The log file contains the logs generated while executing tbLoader.
tbLoader 6 TmaxData Corporation Copyright (c) 2008-. All rights reserved. Data File : ./data.dat Bad File : ./badfile.bad Table 'MEMBER' was loaded from the data file. Column Name Position DataType ------------------------------ ---------- ------------ ID 1 NUMERIC EXTERNAL NAME 2 CHARACTER JOB 3 CHARACTER BIRTHDATE 4 DATE CITY 5 CHARACTER AGE 6 NUMERIC EXTERNAL PICTURE 7 RAW Record 4 was rejected. TBR-80025 : Column data exceeds data buffer size. Table 'MEMBER' ---------------- 6 Rows were requested to load. 5 Rows were loaded successfully. 1 Rows were failed to load because of some errors Elapsed time was: 00:00:00.055475
The error file contains data that tbLoader failed to load. The failed data can be modified and reloaded into Tibero.
7499, "Allen", "Salesman" a,1981:02/20,26