HSORT Description & Purpose
HSORT is a record-processing program. It can filter, modify, sum, and order records, skip records, and select a subset of them. Its main purpose is to give customers an efficient way to manipulate data and extract business-relevant information.
Prerequisites
An elastic_cobol license is required to use HSORT.
To execute "EXEC SORT" statements with HSORT instead of the default Linux sort, add the following property to ebp.properties:
ebp.alias.SORT=HSORT
DCB dependency
HSORT requires DCB entries to obtain the I/O information it needs. Both File DCB and Table DCB are supported.
HSORT Error Handling
- Syntax error check:
  - Only supported statements may be used.
  - SORT FIELDS, INCLUDE/OMIT, INREC/OUTREC, and SUM statements are checked for correctness.
  - Record length and record type must be specified.
- Record validity check for numeric fields in PD and ZD formats.
- Summation overflow check when the SUM statement is used.
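The record validity check for PD and ZD fields can be pictured with a minimal Python sketch. It relies on the standard mainframe encodings (packed decimal: two digit nibbles per byte with the sign in the last nibble; zoned decimal: one digit per byte with the sign in the zone of the last byte). The function names are hypothetical, and the exact rules HSORT applies are not documented here, so treat this as an illustration only.

```python
def is_valid_pd(field: bytes) -> bool:
    """Packed decimal: every nibble is a digit except the last, which is the sign."""
    if not field:
        return False
    nibbles = []
    for b in field:
        nibbles.append(b >> 4)
        nibbles.append(b & 0x0F)
    *digits, sign = nibbles
    # Digits must be 0-9; sign nibbles are A-F (commonly C, D or F).
    return all(d <= 9 for d in digits) and sign >= 0x0A


def is_valid_zd(field: bytes) -> bool:
    """Zoned decimal: low nibble of each byte is a digit, zone of last byte is the sign."""
    if not field:
        return False
    if any((b & 0x0F) > 9 for b in field):
        return False
    # Assumption: all bytes except the last carry the EBCDIC zone F.
    if any((b >> 4) != 0x0F for b in field[:-1]):
        return False
    return (field[-1] >> 4) >= 0x0A


print(is_valid_pd(bytes.fromhex("12345C")))  # valid: digits 1-5, sign C
print(is_valid_zd(bytes.fromhex("F1F2C3")))  # valid: digits 1-3, sign C
```

A record whose PD or ZD key fails such a check would be reported as invalid rather than sorted or summed.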
HSORT input and output
HSORT input and output can be any of the following:
- physical file on disk
- file in VSAM format (a DB table)
- file in Record-descriptor-word (RDW) format
- input from SYSIPT
- output to PUNCH
Which one is used depends on what is specified in the JCL and the DCB. The two are expected to match; if they disagree on the record length or format, the JCL overrides the DCB.
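For readers unfamiliar with the RDW format mentioned in the list above, the following Python sketch splits a byte stream into records. It assumes the conventional mainframe layout: a 4-byte descriptor whose first two bytes hold the big-endian record length, including the descriptor itself. Whether HSORT's RDW files follow exactly this layout is an assumption, and the function name is hypothetical.

```python
import struct


def read_rdw_records(data: bytes) -> list:
    """Split a buffer of RDW-prefixed records into their payloads."""
    records = []
    pos = 0
    while pos < len(data):
        if pos + 4 > len(data):
            raise ValueError("truncated record descriptor word")
        # First halfword: total record length, including the 4-byte RDW.
        (length,) = struct.unpack_from(">H", data, pos)
        if length < 4 or pos + length > len(data):
            raise ValueError("invalid record length %d at offset %d" % (length, pos))
        records.append(data[pos + 4 : pos + length])
        pos += length
    return records


sample = struct.pack(">HH", 7, 0) + b"abc" + struct.pack(">HH", 5, 0) + b"z"
print(read_rdw_records(sample))
```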
HSORT distinguishes between the file formats by the PROTOTYPE property set in the DCB:
"VDB:" -> file from database (DB table)
"SYNC:" -> physical file in RDW format
HSORT Functionalities
At the moment, HSORT has support for the following JCL statements:
- SORT FIELDS=(<fields>)
SORT FIELDS=(1,3,CH,A,4,2,PD,D,7,9,ZD,A)
- SORT FIELDS=COPY
- MERGE FIELDS=COPY
- INREC FIELDS=(<fields>)
INREC FIELDS=(3,3,X,6,6,2X,12,4,1C'END',2X'4C',5X)
- OUTREC FIELDS=(<fields>)
OUTREC FIELDS=(3,3,1C'A',10,4, 4X, 6,4,6,4,X'40404040')
- SUM FIELDS=(<fields>)
SUM FIELDS=(94,5,PD,121,5,ZD)
- SUM FIELDS=NONE
- INCLUDE COND=(<conditions>)
INCLUDE COND=(51,3,CH,EQ,C'972',OR,51,3,CH,EQ,C'763')
- OMIT COND=(<conditions>)
OMIT COND=(211,5,PD,EQ,0,AND,97,5,PD,EQ,0,AND,102,5,PD,EQ,0)
- INPFIL SYSIPT
- OUTFIL PUNCH
- OUTFIL FILES=<number>, <statements>
OUTFIL FILES=1, INCLUDE=(106,15,CH,EQ,C'OSLO')
- OPTION SKIPREC=<number>, NRECS=<number>
OPTION SKIPREC=10,NRECS=15
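As a rough illustration of how the positional operands in SORT FIELDS and INCLUDE COND are read, here is a minimal Python sketch. It is not HSORT code. It assumes the usual sort-utility convention of a 1-based start position followed by a length, handles only CH fields, and compares raw bytes instead of applying an EBCDIC collation. All helper names are hypothetical.

```python
def field(record: bytes, start: int, length: int) -> bytes:
    """Extract a field given a 1-based start position and a length."""
    return record[start - 1 : start - 1 + length]


def include_any(records, conditions):
    """Keep records matching any (start, length, literal) condition (EQ joined by OR)."""
    return [
        r for r in records
        if any(field(r, s, n) == lit for s, n, lit in conditions)
    ]


def sort_records(records, keys):
    """Sort by a list of (start, length, order) keys, order being 'A' or 'D'."""
    result = list(records)
    # Apply keys from least to most significant; list.sort is stable,
    # so earlier passes survive as tie-breakers.
    for start, length, order in reversed(keys):
        result.sort(key=lambda r: field(r, start, length), reverse=(order == "D"))
    return result


records = [b"972BBB", b"100AAA", b"763CCC", b"972AAA"]
# Roughly: INCLUDE COND=(1,3,CH,EQ,C'972',OR,1,3,CH,EQ,C'763')
kept = include_any(records, [(1, 3, b"972"), (1, 3, b"763")])
# Roughly: SORT FIELDS=(1,3,CH,A,4,3,CH,D)
print(sort_records(kept, [(1, 3, "A"), (4, 3, "D")]))
```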
HSORT Statistics
HSORT prints the following statistics:
- Number of records and bytes read
- Number of records and bytes written
- Input and output record length
Logging
Logging can be enabled through the following property in ebp.properties:
hsortlog=(true | false)
EBCDIC collation
HSORT currently supports two EBCDIC code pages: 037 and 277.
The collation sequences used match exactly the following MSSQL collations:
- SQL_EBCDIC037_CP1_CS_AS
- SQL_EBCDIC277_CP1_CS_AS
To specify the code page, add the following ebp property:
hsort.ebcdic.codepage=<codepage-number>
<codepage-number> should be either 037 or 277. If any other value is provided, HSORT defaults to code page 037 (SQL_EBCDIC037_CP1_CS_AS).
Note that there is a slight difference between Codepage037 and SQL_EBCDIC037_CP1_CS_AS. The same is true for code page 277 and SQL_EBCDIC277_CP1_CS_AS.
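The fallback rule and the effect of EBCDIC ordering can be sketched in Python. The sketch uses Python's built-in cp037 codec, which gives the plain code page 037 byte order and therefore is not identical to SQL_EBCDIC037_CP1_CS_AS. Python ships no codec for code page 277, so only 037 is shown. The function names are hypothetical.

```python
SUPPORTED_CODEPAGES = {"037", "277"}


def resolve_codepage(value):
    """Return the configured code page, falling back to 037 for anything else."""
    return value if value in SUPPORTED_CODEPAGES else "037"


def ebcdic037_key(text: str) -> bytes:
    """Sort key: the bytes of the text encoded in plain code page 037."""
    return text.encode("cp037")


words = ["B1", "a1", "1A", "A1"]
print(sorted(words))                     # ASCII order: digits, uppercase, lowercase
print(sorted(words, key=ebcdic037_key))  # EBCDIC order: lowercase, uppercase, digits
```

In EBCDIC, lowercase letters sort before uppercase letters and digits sort last, which is why sort output can differ from an ASCII-based sort of the same data.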
HSORT Parameters
HSORT Parameters in ebp.properties
There are six parameters defined in ebp.properties which can be used to fine-tune HSORT performance.
ebp.hsort.bufferSize=8192
ebp.hsort.maxItemsPerFile=100000
ebp.hsort.initialSortInParallel=no
ebp.hsort.maxFilesPerMerge=100
ebp.hsort.tempdir=
ebp.hsort.defaultMaxMemory=
- ebp.hsort.bufferSize sets the buffer size for the sort operation. If it is not set, 8192 is used.
- ebp.hsort.maxItemsPerFile sets the maximum number of records in each temporary file to be sorted. If it is not set, 100000 is used.
- ebp.hsort.initialSortInParallel enables parallel sorting of the files. Increasing ebp.hsort.maxItemsPerFile might make this feature more effective. The feature is disabled unless the parameter is set to yes or true.
- ebp.hsort.maxFilesPerMerge sets the maximum number of temporary files created when splitting the records to be sorted. If it is not set, 100 is used.
- ebp.hsort.tempdir sets the directory in which the temporary files are created. If it is not set, the default temporary folder of the underlying operating system is used.
- ebp.hsort.defaultMaxMemory sets the default amount of memory used by HSORT for all jobs. The STEPMEM step parameter, if set in JCS type jobs, overrides this value. K, M or G can be used as unit symbols (1024K, 1024M, 1G, etc.).
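To show why parameters such as these matter, here is a minimal Python sketch of a generic external merge sort: records are split into sorted temporary files, which are then merged in groups. This is a textbook technique, not HSORT's implementation. The argument names only mirror the properties above, and treating max_files_per_merge as the number of files merged at once is one reading of the parameter name, not documented behavior. Records are assumed to be newline-free strings.

```python
import heapq
import os
import tempfile


def _write_run(items, tempdir):
    """Write items to a new temporary file, one per line, and return its path."""
    with tempfile.NamedTemporaryFile(
        "w", delete=False, dir=tempdir, suffix=".run", encoding="utf-8"
    ) as f:
        for item in items:
            f.write(item + "\n")
        return f.name


def _read_run(path):
    """Lazily yield the records of one sorted temporary file."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            yield line.rstrip("\n")


def external_sort(records, max_items_per_file=100000, max_files_per_merge=100, tempdir=None):
    fan_in = max(2, max_files_per_merge)  # merging fewer than 2 files makes no progress
    runs, chunk = [], []
    for rec in records:
        chunk.append(rec)
        if len(chunk) >= max_items_per_file:
            runs.append(_write_run(sorted(chunk), tempdir))
            chunk = []
    if chunk:
        runs.append(_write_run(sorted(chunk), tempdir))
    # Merge groups of sorted files until a single file remains.
    while len(runs) > 1:
        group, runs = runs[:fan_in], runs[fan_in:]
        merged = _write_run(heapq.merge(*(_read_run(p) for p in group)), tempdir)
        for path in group:
            os.remove(path)
        runs.append(merged)
    result = list(_read_run(runs[0])) if runs else []
    for path in runs:
        os.remove(path)
    return result


print(external_sort(["d", "a", "c", "b", "e"], max_items_per_file=2, max_files_per_merge=2))
```

In this sketch, a larger max_items_per_file means fewer but bigger temporary files, and a larger max_files_per_merge means fewer merge passes at the cost of more files open at once.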
hsort.properties file
This file determines which hexadecimal values are kept in hexadecimal format, based on job, procedure and step names. It can also apply ASCII sort order to specific sort steps, using a similar key format. The file should be placed in the same directory as the ebp.properties file. It is not required for HSORT to run.
Usage
Parameter | Description | Values, Default |
ebp.hsort.converthex.<jobname>.<stepname> | Hexadecimal value conversion for a given step of a job. Should be all lowercase. | true or false. Default true |
ebp.hsort.converthex.<jobname>.<procname>.<stepname> | Hexadecimal value conversion for a given step of a job where the HSORT step is contained within a procedure. Should be all lowercase. | true or false. Default true |
ebp.hsort.asciiorder.<jobname>.<stepname> | Set the sort order to ASCII instead of EBCDIC for a given step of a job. Should be all lowercase. | true or false. Default false |
ebp.hsort.asciiorder.<jobname>.<procname>.<stepname> | Set the sort order to ASCII instead of EBCDIC for a given step of a job where the HSORT step is contained within a procedure. Should be all lowercase. | true or false. Default false |
ebp.hsort.statistics.reversedOrder | Order of the HSORT records-written statistics. The straight order follows the order of the DD file names defined in the JCL. The reversed order is the order implemented on the mainframe. | true or false. Default false |
ebp.hsort.statistics.useSORTOFddnames | Determines whether the DD name or the output file name is used to identify where the records are written. | true or false. Default false |
ebp.hsort.postgres.addbinarylengthtovarchar | Determines whether a 2-byte binary length is added to a varchar field. | true or false. Default false |
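The per-step keys in the table can be illustrated with a small Python sketch that builds the key for a job, optional procedure, and step, then looks it up with the documented default. The helper name is hypothetical, and lowercasing the job, procedure and step names is an interpretation of the "all lowercase" remark above, not confirmed behavior.

```python
def step_flag(props, option, job, step, proc=None, default=False):
    """Look up ebp.hsort.<option>.<job>[.<proc>].<step> in a dict of properties."""
    parts = [job, proc, step] if proc else [job, step]
    # Assumption: the job/procedure/step part of the key is written in lowercase.
    key = "ebp.hsort.%s.%s" % (option, ".".join(parts).lower())
    value = props.get(key)
    if value is None:
        return default
    return value.strip().lower() == "true"


props = {
    "ebp.hsort.asciiorder.payroll.step010": "true",
    "ebp.hsort.converthex.payroll.proc1.step020": "false",
}
print(step_flag(props, "asciiorder", "PAYROLL", "STEP010"))
print(step_flag(props, "converthex", "PAYROLL", "STEP020", proc="PROC1", default=True))
```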
Useful resources on command usage and syntax:
https://drive.google.com/file/d/1sDQ9-o1HybzdTfrf00yF_as9dHB2JCic/view
https://drive.google.com/file/d/1T_k0KLAlEIl7CFnPQwtHCyBZQKVAbAzy/view
https://www.ibm.com/docs/en/zos/2.1.0?topic=statements-option-control-statement