Chapter 3. Data Set Migration

Table of Contents

3.1. Overview
3.2. Non-VSAM Data Set Migration
3.2.1. Migration Target Identification and Source Extraction
3.2.2. Data Set Layout Analysis
3.2.3. Data Set Migration Schema Creation
3.2.4. Target Data Set Migration
3.2.5. PDS Migration
3.2.6. GDG BASE Definition
3.2.7. Variable-length Data Set Migration
3.2.8. Data Set Migration Verification
3.3. VSAM Data Set Migration
3.3.1. Defining VSAM Data Sets
3.3.2. Importing VSAM Data Sets
3.3.3. Writing Data Set Import Programs
3.4. Static and Dynamic Migration
3.4.1. Static Migration vs. Dynamic Migration
3.4.2. When Dynamic Migration is Required
3.4.3. When Dynamic Migration is Recommended
3.4.4. Dynamic Migration Usage
3.5. Data Set Migration Examples
3.5.1. Fixed-length Data Sets
3.5.2. Variable-length Data Sets
3.5.3. Data Sets with REDEFINES Statements
3.5.4. Data Sets Using Multiple Layouts

This chapter briefly describes the processes for converting and migrating data sets from a legacy mainframe system to the OpenFrame system.

Data sets are collections of logically linked data records. A record is the basic unit of data used by applications and system processes. There are many different types of data sets, but they can be grouped into two broad categories: VSAM data sets and non-VSAM data sets.

Note

For more information about data sets, refer to OpenFrame Data Set Guide.

The method of converting data sets for migration from mainframe to OpenFrame varies with the data set type, as shown in the following figure.


The following are the general steps for migrating non-VSAM data sets to OpenFrame.

  1. Migration target identification and source extraction

  2. Data set layout analysis

  3. Data set migration schema creation

  4. Target data set migration

  5. PDS migration

  6. GDG BASE definition

  7. Variable-length data set migration

  8. Data set migration verification

Depending on the preference of the customer, data set layout information may be provided in a variety of formats. However, not all formats are supported by OpenFrame; the layout information must first be converted into either standard COBOL copybook format or PL/I include format.

<XTBC106.cpy>

       01 I1.
           05  KYAKUMEI-KN           PIC X(0018).
           05  BTN-CD                PIC X(0004).
           05  KYAKU-NO              PIC X(0007).
           05  ATUKAI-CD             PIC X(0003).
           05  MDY-CD                PIC X(0003).
           05  YAKU-YMD              PIC S9(0009) COMP-3.
           05  UKEW-YMD              PIC S9(0009) COMP-3.
           05  KOYU-MEI-CD           PIC S9(0005) COMP-3.
           05  KAISU                 PIC S9(0005) COMP-3.
           05  GO .
               10  GO-1              PIC X(0001).
               10  GO-2              PIC X(0001).
               10  GO-3              PIC X(0001).

The cobgensch tool can be used to create the data set migration schema file from a COBOL copybook file <XTBC106.cpy>.

In the following example, the total record length is specified as 54 with the -r option. If a record length is specified, ensure that the total length of the fields in the COBOL copybook file matches the value specified.

$ cobgensch XTBC106.cpy -r 54
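The record length passed to -r can be checked by hand: a PIC X(n) field occupies n bytes, and a COMP-3 (packed) field with d digits occupies ⌈(d+1)/2⌉ bytes. The following short Python sketch (not part of the OpenFrame toolset) sums the field sizes of <XTBC106.cpy>:

```python
# Sketch: verify that the XTBC106.cpy field sizes add up to the -r value (54).
from math import ceil

def pic_x(n):
    """PIC X(n): alphanumeric field, n bytes."""
    return n

def comp3(digits):
    """COMP-3 (packed decimal): digits plus a sign nibble, packed two per byte."""
    return ceil((digits + 1) / 2)

fields = [pic_x(18), pic_x(4), pic_x(7), pic_x(3), pic_x(3),   # X(18)..X(3)
          comp3(9), comp3(9), comp3(5), comp3(5),              # S9(9)/S9(5) COMP-3
          pic_x(1), pic_x(1), pic_x(1)]                        # GO-1..GO-3

print(sum(fields))  # 54, matching -r 54
```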

The following example shows the schema file generated by cobgensch from the <XTBC106.cpy> file.

<XTBC106.conv>

* Schema Version 7.1
L1, 01, I1, NULL, NULL, 0, 1:1,
L2, 05, KYAKUMEI-KN, EBC_ASC, NULL, 18, 1:1,
L3, 05, BTN-CD, EBC_ASC, NULL, 4, 1:1,
L4, 05, KYAKU-NO, EBC_ASC, NULL, 7, 1:1,
L5, 05, ATUKAI-CD, EBC_ASC, NULL, 3, 1:1,
L6, 05, MDY-CD, EBC_ASC, NULL, 3, 1:1,
L7, 05, YAKU-YMD, PACKED, NULL, 5, 1:1,
L8, 05, UKEW-YMD, PACKED, NULL, 5, 1:1,
L9, 05, KOYU-MEI-CD, PACKED, NULL, 3, 1:1,
L10, 05, KAISU, PACKED, NULL, 3, 1:1,
L11, 05, GO, NULL, NULL, 0, 1:1,
L12, 10, GO-1, EBC_ASC, NULL, 1, 1:1,
L13, 10, GO-2, EBC_ASC, NULL, 1, 1:1,
L14, 10, GO-3, EBC_ASC, NULL, 1, 1:1,

* Condition
L0, "\0", ( L1 L2 L3 L4 L5 L6 L7 L8 L9 L10 L11 L12 L13 L14 )

Each line in the copybook file containing a PIC statement is converted into data set conversion schema format. The PIC X statement specifies that the value of that field will be converted from EBCDIC to ASCII format; the COMP-3 statement after PIC S9 indicates a field in PACKED format. In addition to these formats, the ZONED type (zoned decimal field) and the GRAPHIC type (2-byte character field) can be used in schema files.
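The two conversions described above can be sketched in Python for illustration. This is not OpenFrame code; the EBCDIC code page in use varies by site, and cp037 is assumed here:

```python
# Illustrative sketch of the EBC_ASC and PACKED field conversions
# (cp037 EBCDIC code page assumed; actual code pages vary by site).

def ebc_to_asc(raw: bytes) -> str:
    """EBC_ASC: decode EBCDIC bytes to a character string."""
    return raw.decode("cp037")

def unpack_comp3(raw: bytes) -> int:
    """PACKED: decode a COMP-3 field; the low nibble of the last byte is the sign."""
    digits = []
    for b in raw[:-1]:
        digits += [b >> 4, b & 0x0F]    # two digits per byte
    digits.append(raw[-1] >> 4)          # last byte: one digit + sign nibble
    sign_nibble = raw[-1] & 0x0F
    value = int("".join(map(str, digits)))
    return -value if sign_nibble == 0x0D else value

print(ebc_to_asc(b"\xC1\xC2\xC3"))            # ABC
print(unpack_comp3(b"\x00\x01\x23\x45\x6C"))  # 123456 (0x0C sign = positive)
```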

Instead of COBOL copybook files, PL/I include-type files such as the following can be used.

<XTBC107.inc>

     3  ID CHAR(02),
     3  NAME CHAR(05),
     3  CODE,
       5  CODE_NUM    PIC'(03)9',
       5  CODE_DAT    PIC'(03)X',
     3  ETC FIXED DEC (3, 0) ;

If a PL/I include file, <XTBC107.inc>, is used, the data set conversion schema file must be created with pligensch.

The following command uses the -s option. This option must be specified when converting only a part of the PL/I include file.

$ pligensch -s XTBC107.inc

The following is the data set conversion schema file generated from <XTBC107.inc>. The CHAR, PIC 9, and FIXED DEC statements are converted by pligensch to EBC_ASC, ZONED, and PACKED data types respectively.

<XTBC107.conv>

* Schema Version 7.1
L1, 01, ID_STRUCTURE, NULL, NULL, 0, 1:1,
L2, 03, ID, EBC_ASC, NULL, 2, 1:1,
L3, 03, NAME, EBC_ASC, NULL, 5, 1:1,
L4, 03, CODE, NULL, NULL, 0, 1:1,
L5, 05, CODE_NUM, ZONED, NULL, 3, 1:1,
L6, 05, CODE_DAT, EBC_ASC, NULL, 3, 1:1,
L7, 03, ETC, PACKED, NULL, 2, 1:1,

* Condition
L0, "\0", ( L1 L2 L3 L4 L5 L6 L7 )

The migration process for variable-length data sets is the same as that for regular data sets. However, analyzing the data set layout and generating the migration schema can be more complicated.

For variable-length data sets, the source data set extracted from the mainframe is a sequential file with the following format. Each record consists of a 4-byte Record Descriptor Word (RDW) followed by the data. The first two bytes of the RDW hold the length of the record (including the 4 RDW bytes), and the remaining two bytes are filled with zeros. Using the length information in the RDW, you can easily determine where each record starts and ends.
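The RDW-based record structure can be pictured with a small parser. The following is an illustrative Python sketch, assuming big-endian length fields as on the mainframe:

```python
import struct

def split_rdw_records(data: bytes):
    """Split a variable-length source file into records using each 4-byte RDW.
    The first two bytes hold the record length (big-endian, RDW included);
    the remaining two bytes are zero."""
    records = []
    pos = 0
    while pos < len(data):
        (rec_len,) = struct.unpack(">H", data[pos:pos + 2])
        records.append(data[pos + 4:pos + rec_len])  # data follows the 4-byte RDW
        pos += rec_len
    return records

# Two records: lengths 9 (5 data bytes) and 7 (3 data bytes)
raw = b"\x00\x09\x00\x00HELLO" + b"\x00\x07\x00\x00ABC"
print(split_rdw_records(raw))  # [b'HELLO', b'ABC']
```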


As with fixed-length data sets, you can express the layout of a variable-length data set in COBOL copybook format and create the migration schema with the cobgensch tool. However, depending on the project, you can also write the migration schema directly, as shown in the following example, without first reorganizing the layout into COBOL copybook format.

* Schema Version 7.1
L1, 01, V-REC, NULL, NULL, 0, 1:1,
L2, 05, REC-KEY, EBC_ASC, NULL, 8, 1:1,
L3, 05, ODO-LEN, COPY, NULL, 2, 1:1,
L4, 05, ODO-FLD, NULL, NULL, 0, 0:4000, ODO-LEN
L5, 07, ODO-DAT, EBC_ASC, NULL, 1, 1:1,

* Condition
L0, "\0", ( L1 L2 L3 L4 L5 )

The ODO-FLD field in the schema file above is an example of a variable-length field whose length varies according to the value of the ODO-LEN field. The number of repetitions of ODO-FLD, and therefore the record length, is determined by reading the ODO-LEN field value of each record. After the source data set and the migration schema file are prepared, the actual migration can be performed with the dsmigin tool in the same way as for a fixed-length data set.

The following example creates the data set <TEST.VAR> with the dsmigin tool. <TEST.VAR.raw> is the source file and <TEST.VAR.conv> is the migration schema file:

$ dsmigin TEST.VAR.raw TEST.VAR -s TEST.VAR.conv -e JP -f VB

The following are the general steps for migrating VSAM data sets to OpenFrame.

  1. Create an empty VSAM data set by using the DEFINE CLUSTER command of the idcams tool.

  2. Import data to the VSAM data set using the RECATALOG option of the dsmigin tool.

For data sets that cannot be migrated using this procedure, such as an RRDS with missing record numbers, you must write a separate data set import program for migration.

Data sets can be migrated using two methods: static migration and dynamic migration. You can determine which method was used by looking at the data set schema.

This section describes the migration sequence for four types of data sets: fixed-length data sets, variable-length data sets, data sets that use the REDEFINES statement, and data sets that use multiple layouts.

For data sets with copybook files that contain REDEFINES statements, information must be provided about the conditions that will trigger the redefinition.

01 A.
       05 B     PIC X(01).
       05 C     PIC X(05).
       05 D     REDEFINES   C.
       10 E     PIC X(02).
       10 F     PIC X(03).

In the above example, assume that the D field should be used to redefine C only when the value of the B field is "A"; in all other cases, the C field should be referenced. To achieve this, a $$COND statement must be added to the copybook file.

The following is the syntax of a $$COND statement.

$$COND : NAME_01 : VALUE_01 [: NAME_02 : VALUE_02 … ] : REDEFINE_NAME [...]
  • $$COND

    • Label that specifies the start of the condition statement.

  • NAME_01 : VALUE_01

    • A field name and condition value pair separated with a colon (:). When using the dsmigin tool, the value specified in VALUE_# will be converted automatically to the NAME_# field type. This means that you can enter an ASCII value enclosed with double quotes without knowing the actual field type, such as EBCDIC, ZONED, or PACKED.

    • Multiple 'NAME_# : VALUE_#' pairs are also separated with a colon (:).

  • REDEFINE_NAME

    • The redefined field to use when the condition is met. Multiple field names can be specified.

    • The type of a value in the condition statement can be converted to another type by explicitly specifying the desired type.

    • The following field types can be specified.

      Type                  Description
      --------------------  -------------------------------------------------------
      ! (exclamation mark)  Checks for inequality (NOT EQUAL).
      (empty)               Uses the default type of the field.
      'A'                   Uses the ASCII character value.
      'E'                   Converts to an EBCDIC character.
      'P' or 'UP'           Converts to Packed Decimal or Unsigned Packed Decimal type.
      'Z' or 'UZ'           Converts to Zoned Decimal or Unsigned Zoned Decimal type.
      'G'                   Converts to a GRAPHIC (2-byte) character.
      'H'                   Converts the value to a hexadecimal value. (Example: H"123" → 0x7B)
      'X'                   Converts each 2-character value to a 1-byte hex value. (Example: X"12AB" → 0x12 0xAB)
      'T'                   Checks the field type. Only Zoned, Packed, and National are supported. (Example: T"ZONED"/T"PACKED"/T"NATIONAL" checks whether the field is Zoned Decimal/Packed Decimal/National Character.)

    • To specify a packed or zoned decimal type, you must decide whether it is a signed or unsigned decimal. For example, if the hexadecimal value of the last digit of a packed decimal field is '0x0C' (positive value) or '0x0D' (negative value), then you must use the signed packed decimal type; if the last digit is '0x0F', then you must use the unsigned packed decimal type instead.

      If the hexadecimal value of the last digit of a zoned decimal field is '0xC0' (positive value) or '0xD0' (negative value), then you must use the signed zoned decimal type; if the last digit is '0xF0', you must use the unsigned zoned decimal type.
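The sign-nibble rules above can be sketched in Python (an illustrative sketch of the 0x0C/0x0D/0x0F conventions the text describes, not OpenFrame code):

```python
def packed_sign_kind(raw: bytes) -> str:
    """Classify a packed-decimal field by the low nibble of its last byte."""
    nibble = raw[-1] & 0x0F
    if nibble == 0x0C:
        return "signed (positive)"   # use signed packed decimal ('P')
    if nibble == 0x0D:
        return "signed (negative)"   # use signed packed decimal ('P')
    if nibble == 0x0F:
        return "unsigned"            # use unsigned packed decimal ('UP')
    return "unknown"

def zoned_sign_kind(raw: bytes) -> str:
    """Classify a zoned-decimal field by the zone (high) nibble of its last byte."""
    zone = raw[-1] >> 4
    return {0xC: "signed (positive)", 0xD: "signed (negative)",
            0xF: "unsigned"}.get(zone, "unknown")

print(packed_sign_kind(b"\x12\x3C"))     # signed (positive) -> 'P'
print(zoned_sign_kind(b"\xF1\xF2\xC3"))  # signed (positive) -> 'Z'
```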

The following example uses a $$COND conditional statement in a copybook with a REDEFINES statement.

01 A.
       05 B     PIC X(01).
       05 C     PIC X(05).
       05 D     REDEFINES   C.
       10 E     PIC X(02).
       10 F     PIC X(03).
$$COND : B : "A" : D

The cobgensch tool references the $$COND statement to generate a migration schema that will be used by dsmigin.

The following is an example of a migration schema generated by cobgensch.

* Schema Version 7.1
L1, 01, A, NULL, NULL, 0, 1:1,
L2, 05, B, EBC_ASC, NULL, 1, 1:1,
L3, 05, C, EBC_ASC, NULL, 5, 1:1,
L4, 05, D, NULL, NULL, 0, 1:1,  # REDEFINES C
L5, 10, E, EBC_ASC, NULL, 2, 1:1,
L6, 10, F, EBC_ASC, NULL, 3, 1:1,

* Condition
L2, "A", ( L1 L2 L4 L5 L6 )
L0, "\0", ( L1 L2 L3 )
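The way these conditions select a layout can be pictured as follows. This is a simplified Python sketch, not the actual dsmigin logic: for each record, the value of field B (offset 0, length 1) is tested; if it is "A", the REDEFINES layout (L1 L2 L4 L5 L6) applies, otherwise the default (L1 L2 L3). The cp037 EBCDIC code page is assumed.

```python
# Simplified sketch of condition-based layout selection (not the actual
# dsmigin implementation). Field B occupies the first byte of the record.

def select_layout(record: bytes) -> list:
    b_value = record[0:1].decode("cp037")   # EBCDIC source; cp037 assumed
    if b_value == "A":
        # condition L2, "A": use the REDEFINES layout D (fields E and F)
        return ["L1", "L2", "L4", "L5", "L6"]
    # default condition L0: use field C as-is
    return ["L1", "L2", "L3"]

print(select_layout(b"\xC1XYZWV"))  # B == "A" -> redefined layout
print(select_layout(b"\xC2XYZWV"))  # B == "B" -> default layout
```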

The conditions specified in the migration schema are applied during migration.

The statement beginning with the label 'L0' is the default condition statement.

Data sets using multiple layouts are handled with conditional statements similar to those described in the previous section.

01 AAA-A.
  03 BBB-A.
    05 DATA-1   PIC  X(01).
    05 DATA-2   PIC  X(04).
    05 DATA-A   REDEFINES DATA-2.
      07 DATA-A1   PIC  9(02).
      07 DATA-A2   PIC  X(02).
01 AAA-B.
  03 BBB-B.
    05 DATA-3   PIC  9(01).
    05 DATA-4   PIC  9(04).
    05 DATA-D   REDEFINES DATA-4.
      07 DATA-D1   PIC  9(01).
      07 DATA-D2   PIC  X(03).

When there are multiple level 01 items, as in the previous example, you must specify a $$COND statement, just as for a REDEFINES statement, in order to use a layout other than the first one.

A $$COND statement to use a specific layout can be written as follows:

$$COND : DATA-3 : "B" : AAA-B

The following is an example of a $$COND statement that specifies the REDEFINES statement of the second layout.

$$COND : DATA-3 : "D" : AAA-B DATA-D

The following is an example of a schema file created by using the cobgensch tool:

* Schema Version 7.1
L1, 01, AAA-A, NULL, NULL, 0, 1:1,
L2, 03, BBB-A, NULL, NULL, 0, 1:1,
L3, 05, DATA-1, EBC_ASC, NULL, 1, 1:1,
L4, 05, DATA-2, EBC_ASC, NULL, 4, 1:1,
L5, 05, DATA-A, NULL, NULL, 0, 1:1,  # REDEFINES DATA-2
L6, 07, DATA-A1, U_ZONED, NULL, 2, 1:1,
L7, 07, DATA-A2, EBC_ASC, NULL, 2, 1:1,

L8, 01, AAA-B, NULL, NULL, 0, 1:1,
L9, 03, BBB-B, NULL, NULL, 0, 1:1,
L10, 05, DATA-3, U_ZONED, NULL, 1, 1:1,
L11, 05, DATA-4, U_ZONED, NULL, 4, 1:1,
L12, 05, DATA-D, NULL, NULL, 0, 1:1,  # REDEFINES DATA-4
L13, 07, DATA-D1, U_ZONED, NULL, 1, 1:1,
L14, 07, DATA-D2, EBC_ASC, NULL, 3, 1:1,

* Condition
L10, "B", ( L8 L9 L10 L11 )
L10, "D", ( L8 L9 L10 L12 L13 L14 )
L0, "\0", ( L1 L2 L3 L4 )

The schema file includes two layouts, each defined as a level 01 item. All conditions specified with $$COND, other than the default condition, start from the AAA-B field, the level 01 item of the second layout.

The schema file applies a conversion rule based on the value of the specified field (DATA-3 in the previous example), which makes it possible to migrate data sets that use multiple layouts.
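The three conditions in this schema are evaluated in order. The following simplified Python sketch (not the actual dsmigin logic; cp037 EBCDIC code page assumed) pictures the selection, where the discriminator byte is the first byte of the record:

```python
# Simplified sketch of multi-layout selection (not the actual dsmigin logic).
# The discriminator field DATA-3 occupies the first byte of the record.

def select_record_layout(record: bytes) -> str:
    data_3 = record[0:1].decode("cp037")  # cp037 EBCDIC code page assumed
    if data_3 == "B":
        return "AAA-B"               # second layout, DATA-4 as-is
    if data_3 == "D":
        return "AAA-B with DATA-D"   # second layout, REDEFINES applied
    return "AAA-A"                   # default (first) layout

print(select_record_layout(b"\xC2\xF1\xF2\xF3\xF4"))  # AAA-B
print(select_record_layout(b"\xC4\xF1\xF2\xF3\xF4"))  # AAA-B with DATA-D
print(select_record_layout(b"\xF5\xF1\xF2\xF3\xF4"))  # AAA-A
```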