HSORT Documentation – Heirloom Computing

HSORT Description & Purpose

HSORT is a program that processes records by filtering, reformatting, summing, ordering, skipping, and selecting subsets of them. Its main purpose is to give customers an efficient way to manipulate data and extract business-relevant information.

Prerequisites

An elastic_cobol license is needed in order to use the HSORT product.

To execute "EXEC SORT" statements with HSORT instead of the default Linux sort, add the following property to ebp.properties:

ebp.alias.SORT=HSORT

DCB dependency

HSORT requires DCB entries to obtain the I/O information it needs. Both File DCB and Table DCB are supported.

HSORT Error Handling

Syntax error check:

- Every statement used must be one that HSORT supports.

- SORT FIELDS, INCLUDE/OMIT, INREC/OUTREC, and SUM statements are checked for correctness.

- Record length and record type must be specified.

Record validity check: numeric fields in PD and ZD formats are checked for validity.

Summation overflow check: when a SUM statement is used, HSORT checks for summation overflow.
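To make the PD/ZD validity and overflow checks concrete, here is a minimal Python sketch. It assumes the usual mainframe encodings (packed decimal with a trailing sign nibble, EBCDIC zoned decimal with the sign in the last zone nibble). The function names are hypothetical, and the exact rules HSORT applies may be stricter or looser than shown.

```python
def is_valid_pd(field: bytes) -> bool:
    """Return True if field looks like a well-formed packed-decimal (PD) value."""
    if not field:
        return False
    nibbles = []
    for byte in field:
        # Each byte holds two 4-bit nibbles.
        nibbles.extend((byte >> 4, byte & 0x0F))
    *digits, sign = nibbles
    # Every nibble but the last is a digit 0-9; the last is a sign code A-F.
    return all(d <= 9 for d in digits) and sign >= 0xA


def is_valid_zd(field: bytes) -> bool:
    """Return True if field looks like a well-formed EBCDIC zoned-decimal (ZD) value."""
    if not field:
        return False
    *body, last = field
    # Leading bytes are EBCDIC digits X'F0'..X'F9'.
    if any(not 0xF0 <= b <= 0xF9 for b in body):
        return False
    # Assumption: the sign zone of the last byte is C, D or F.
    return (last >> 4) in (0xC, 0xD, 0xF) and (last & 0x0F) <= 9


def pd_sum_fits(total: int, length: int) -> bool:
    """A PD field of `length` bytes holds 2*length - 1 decimal digits."""
    return abs(total) < 10 ** (2 * length - 1)
```

For example, a 5-byte PD field holds nine digits, so a running total of 1,000,000,000 would be reported as an overflow under these assumptions.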

HSORT input and output

Depending on what is specified in the JCL and the DCB, the input/output for HSORT can be one of the following:

- physical file on disk
- file in VSAM format (a DB table)
- file in Record-descriptor-word (RDW) format
- input from SYSIPT
- output to PUNCH

The JCL and the DCB are expected to match. If the record length or format differs between them, the JCL overrides the DCB.

The file format is determined by the PROTOTYPE property set in the DCB:

"VDB:" -> file from database (DB table)
"SYNC:" -> physical file in RDW format

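As an illustration of the RDW format mentioned above, the following Python sketch splits a byte stream into records. It assumes the conventional mainframe layout: a 4-byte descriptor whose first two bytes are the big-endian record length (including the descriptor itself) and whose last two bytes are reserved. Whether HSORT's "SYNC:" files follow exactly this layout is an assumption, and `read_rdw_records` is a hypothetical name.

```python
import struct
from typing import Iterator


def read_rdw_records(data: bytes) -> Iterator[bytes]:
    """Yield the payload of each RDW-prefixed record in data."""
    offset = 0
    while offset < len(data):
        if offset + 4 > len(data):
            raise ValueError("truncated RDW at offset %d" % offset)
        # Two big-endian halfwords: total length (RDW included) and reserved bytes.
        length, _reserved = struct.unpack_from(">HH", data, offset)
        if length < 4 or offset + length > len(data):
            raise ValueError("invalid RDW length at offset %d" % offset)
        yield data[offset + 4 : offset + length]
        offset += length


# Two sample records: "abc" (length 7 with RDW) and "z" (length 5 with RDW).
sample = struct.pack(">HH", 7, 0) + b"abc" + struct.pack(">HH", 5, 0) + b"z"
records = list(read_rdw_records(sample))
```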
HSORT Functionalities

HSORT currently supports the following JCL statements:

SORT FIELDS=(<fields>)

SORT FIELDS=(1,3,CH,A,4,2,PD,D,7,9,ZD,A)
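The field list in the example above is a sequence of four-part groups: position, length, format, and order. The Python sketch below splits such a statement into its keys. It is only an illustration of the syntax, not HSORT's parser, and `SortKey` and `parse_sort_fields` are hypothetical names. Positions are assumed to be 1-based, as in mainframe sort utilities.

```python
from typing import List, NamedTuple


class SortKey(NamedTuple):
    position: int  # 1-based starting byte of the key (assumed)
    length: int    # key length in bytes
    fmt: str       # data format, e.g. CH, PD, ZD
    order: str     # A = ascending, D = descending


def parse_sort_fields(statement: str) -> List[SortKey]:
    """Split 'SORT FIELDS=(p,l,f,o,...)' into a list of SortKey tuples."""
    inner = statement[statement.index("(") + 1 : statement.rindex(")")]
    parts = [p.strip() for p in inner.split(",")]
    if len(parts) % 4 != 0:
        raise ValueError("expected groups of position,length,format,order")
    return [
        SortKey(int(parts[i]), int(parts[i + 1]), parts[i + 2], parts[i + 3])
        for i in range(0, len(parts), 4)
    ]


keys = parse_sort_fields("SORT FIELDS=(1,3,CH,A,4,2,PD,D,7,9,ZD,A)")
```

Here `keys` holds three entries: a 3-byte character key ascending, a 2-byte packed-decimal key descending, and a 9-byte zoned-decimal key ascending.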

```
SORT FIELDS=COPY
```
```
MERGE FIELDS=COPY
```
INREC FIELDS=(<fields>)

INREC FIELDS=(3,3,X,6,6,2X,12,4,1C'END',2X'4C',5X)

OUTREC FIELDS=(<fields>)

OUTREC FIELDS=(3,3,1C'A',10,4,4X,6,4,6,4,X'40404040')

SUM FIELDS=(<fields>)

SUM FIELDS=(94,5,PD,121,5,ZD)

```
SUM FIELDS=NONE
```
INCLUDE COND=(<conditions>)

INCLUDE COND=(51,3,CH,EQ,C'972',OR,51,3,CH,EQ,C'763')

OMIT COND=(<conditions>)

OMIT COND=(211,5,PD,EQ,0,AND,97,5,PD,EQ,0,AND,102,5,PD,EQ,0)
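The following Python sketch shows what the INCLUDE example above means for a single record, assuming 1-based positions and records already decoded to text. The helper names are hypothetical. OMIT works the same way with the result inverted: records matching the condition are dropped instead of kept.

```python
def field(record: str, position: int, length: int) -> str:
    """Extract a field given a 1-based position and a length."""
    start = position - 1
    return record[start : start + length]


def include(record: str) -> bool:
    # INCLUDE COND=(51,3,CH,EQ,C'972',OR,51,3,CH,EQ,C'763')
    return field(record, 51, 3) == "972" or field(record, 51, 3) == "763"


# Three sample records with different values at positions 51-53.
records = [" " * 50 + "972" + "rest", " " * 50 + "111", " " * 50 + "763"]
kept = [r for r in records if include(r)]
```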

```
INPFIL SYSIPT
```
```
OUTFIL PUNCH
```
OUTFIL FILES=<number>, <statements>

OUTFIL FILES=1, INCLUDE=(106,15,CH,EQ,C'OSLO')

OPTION SKIPREC=<number>, NRECS=<number>
```
OPTION SKIPREC=10,NRECS=15
```
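The source does not spell out the semantics of SKIPREC and NRECS. The Python sketch below assumes that SKIPREC drops that many records from the start of the input and that NRECS caps how many of the remaining records are processed. `select_records` is a hypothetical helper name.

```python
from itertools import islice


def select_records(records, skiprec=0, nrecs=None):
    """Skip the first `skiprec` records, then pass through at most `nrecs`."""
    stop = None if nrecs is None else skiprec + nrecs
    return islice(records, skiprec, stop)


# OPTION SKIPREC=10,NRECS=15 applied to records numbered 1..100
selected = list(select_records(range(1, 101), skiprec=10, nrecs=15))
```

Under these assumptions, `selected` contains records 11 through 25.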

HSORT Statistics

HSORT prints the following statistics:

Number of records read and bytes read
Number of records written and bytes written
Input and output record length

Logging

Logging can be turned on or off with the following property in ebp.properties:

hsortlog=(true | false)

EBCDIC collation

HSORT currently supports two EBCDIC codepages: 037 and 277.

The collation sequences used match exactly the following collation sequences from MSSQL:

SQL_EBCDIC037_CP1_CS_AS
SQL_EBCDIC277_CP1_CS_AS

To specify the codepage, add the following ebp property:

hsort.ebcdic.codepage=<codepage-number>

<codepage-number> should be either 037 or 277.

If any other value is provided, HSORT defaults to codepage 037 (SQL_EBCDIC037_CP1_CS_AS).

Note that there is a slight difference between Codepage037 and SQL_EBCDIC037_CP1_CS_AS. The same is true for codepage 277 and SQL_EBCDIC277_CP1_CS_AS.
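To show why the collation matters, the Python sketch below sorts the same keys by their ASCII byte values and by their codepage 037 byte values. It uses Python's built-in `cp037` codec, which gives plain codepage 037 order and therefore only approximates SQL_EBCDIC037_CP1_CS_AS, as noted above. The `resolve_codepage` helper is a hypothetical illustration of the fallback rule.

```python
SUPPORTED_CODEPAGES = ("037", "277")


def resolve_codepage(value):
    """Fall back to 037 when the configured value is missing or unsupported."""
    return value if value in SUPPORTED_CODEPAGES else "037"


keys = ["A", "a", "1"]

# ASCII byte order: digits, then uppercase, then lowercase.
ascii_sorted = sorted(keys, key=lambda s: s.encode("ascii"))

# Codepage 037 byte order: lowercase, then uppercase, then digits.
ebcdic_sorted = sorted(keys, key=lambda s: s.encode("cp037"))
```

With these inputs, `ascii_sorted` is `["1", "A", "a"]` and `ebcdic_sorted` is `["a", "A", "1"]`.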

HSORT Parameters

HSORT Parameters in ebp.properties

There are six parameters defined in ebp.properties which can be used to fine-tune HSORT performance.

ebp.hsort.bufferSize=8192
ebp.hsort.maxItemsPerFile=100000
ebp.hsort.initialSortInParallel=no
ebp.hsort.maxFilesPerMerge=100
ebp.hsort.tempdir=
ebp.hsort.defaultMaxMemory=

The ebp.hsort.bufferSize parameter sets the buffer size for the sort operation. If it is not set, the default value is 8192.
The ebp.hsort.maxItemsPerFile parameter sets the maximum number of records in each temporary file to be sorted. If it is not set, the default value is 100000.
The ebp.hsort.initialSortInParallel parameter enables parallel sorting in files. Increasing the value of ebp.hsort.maxItemsPerFile might increase the effectiveness of this feature. The feature is disabled unless this parameter is set to yes or true.
The ebp.hsort.maxFilesPerMerge parameter sets the maximum number of temporary files created for splitting the records to be sorted. If it is not set, the default value is 100.
The ebp.hsort.tempdir parameter sets the directory in which the temporary files are created. If it is not set, the default temporary folder of the underlying operating system is used.
The ebp.hsort.defaultMaxMemory parameter sets the default amount of memory used by HSORT for all jobs. This value is overridden if the STEPMEM step parameter is set in JCS type jobs. K, M or G can be used as unit symbols (1024K, 1024M, 1G, etc.).
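As a rough illustration of the unit symbols accepted by ebp.hsort.defaultMaxMemory, the Python sketch below converts such a value to bytes. It assumes binary multiples (1K = 1024 bytes) and treats a value without a suffix as a byte count. Neither assumption is confirmed by the source, and `parse_memory` is a hypothetical name.

```python
# Assumption: K, M and G are binary multiples.
_UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_memory(value: str) -> int:
    """Convert a size such as '1024K', '1024M' or '1G' to a number of bytes."""
    text = value.strip().upper()
    if not text:
        raise ValueError("empty memory value")
    if text[-1] in _UNITS:
        return int(text[:-1]) * _UNITS[text[-1]]
    # Assumption: no suffix means the value is already in bytes.
    return int(text)
```

Under these assumptions, `parse_memory("1024M")` and `parse_memory("1G")` give the same result.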

hsort.properties file

This file determines which hexadecimal values are kept in hexadecimal format, based on job, procedure, and step names. It can also apply ASCII sort order to specific sort steps using a similar format. It should be placed in the same directory as the ebp.properties file. This file is not required for HSORT to run.

Usage

| Parameter | Description | Values, Default |
| --- | --- | --- |
| ebp.hsort.converthex.<jobname>.<stepname> | Hexadecimal value conversion for a given step of a job. Should be all lowercase. | true \| false. Default true |
| ebp.hsort.converthex.<jobname>.<procname>.<stepname> | Hexadecimal value conversion for a given step of a job where the HSORT step is contained within a procedure. Should be all lowercase. | true \| false. Default true |
| ebp.hsort.nullpadding.<jobname>.<stepname> | Use NULL instead of spaces to pad the result to the expected length for a given step of a job. Should be all lowercase. | true \| false. Default false |
| ebp.hsort.nullpadding | Use NULL instead of spaces to pad the result to the expected length. This is a global setting and is applied to all HSORT steps. | true \| false. Default false |
| ebp.hsort.asciiorder.<jobname>.<stepname> | Set the sort order to ASCII instead of EBCDIC for a given step of a job. Should be all lowercase. | true \| false. Default false |
| ebp.hsort.asciiorder.<jobname>.<procname>.<stepname> | Set the sort order to ASCII instead of EBCDIC for a given step of a job where the HSORT step is contained within a procedure. Should be all lowercase. | true \| false. Default false |
| ebp.hsort.statistics.reversedOrder | Order of the HSORT records-written statistics. The straight order follows the order of the DD file names defined in the JCL. The reversed order is the same order as implemented on the mainframe. | true \| false. Default false |
| ebp.hsort.statistics.useSORTOFddnames | Determines whether the DD name or the output file name is used to identify where the records are written. | true \| false. Default false |
| ebp.hsort.postgres.addbinarylengthtovarchar | Determines whether a 2-byte binary length is added to a varchar field. | true \| false. Default false |
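The per-step keys above are built from the job, procedure, and step names. The Python sketch below shows one way such a key could be composed and looked up. It assumes that "all lowercase" applies to the name parts of the key. The job and step names are made up for the example, and `step_flag` is a hypothetical helper rather than part of HSORT.

```python
def step_flag(props, name, job, step, proc=None, default=False):
    """Look up ebp.hsort.<name>.<job>[.<proc>].<step> in a properties dict."""
    parts = [job, proc, step] if proc else [job, step]
    # Assumption: job, procedure and step names are lowercased in the key.
    key = "ebp.hsort." + name + "." + ".".join(p.lower() for p in parts)
    raw = props.get(key)
    if raw is None:
        return default
    return raw.strip().lower() == "true"


# Hypothetical hsort.properties content, already loaded into a dict.
props = {
    "ebp.hsort.converthex.payjob.step010": "false",
    "ebp.hsort.asciiorder.payjob.sortproc.step020": "true",
}

convert_hex = step_flag(props, "converthex", "PAYJOB", "STEP010", default=True)
ascii_order = step_flag(props, "asciiorder", "PAYJOB", "STEP020", proc="SORTPROC")
```

Steps without an entry fall back to the default listed in the table, for example true for converthex and false for asciiorder.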

Links to useful resources about commands usage, syntax, etc.:

https://drive.google.com/file/d/1sDQ9-o1HybzdTfrf00yF_as9dHB2JCic/view

https://drive.google.com/file/d/1T_k0KLAlEIl7CFnPQwtHCyBZQKVAbAzy/view

https://www.ibm.com/docs/en/zos/2.1.0?topic=statements-option-control-statement