Dataflows for Apache NiFi
Introduction
Apache NiFi is a dataflow system for the stream and batch processing of data. Dataflows are configured in the NiFi web GUI to perform the following tasks:
- EDR parsing from text-based EDRs into JSON-based EDRs.
- Storage of EDRs as parsed into the `n2reporting` database.
- Service determination, to split call control and provisioning EDRs into the correct processing paths.
- Aggregation of EDR data into summarised forms for reporting.
- Database extracts (of the N2ACD database) for integrated EDR + service data reports.
This N2ACD NiFi guide provides step-by-step instructions for configuring N2ACD dataflows. For comprehensive details on how to use Apache NiFi, see the NiFi user guide.
A description of each NiFi process group is provided first, followed by installation and configuration instructions.
The “N2SVCD EDR Parsing” Process Group
The N2SVCD EDR Parsing process group is responsible for parsing N2SVCD text-based EDR files into individual EDR records and then determining the relevant service each EDR belongs to.
This process group will:
- Read EDR files from a configured EDR directory. Note that EDR files read from this directory are deleted as they are read into NiFi. Back up EDR files before placing them in the NiFi input directory, unless the NiFi processing itself is altered to write the processed file out again.
- Convert each EDR in each file read. EDRs are converted into JSON format for subsequent processing, with key fields (including the session ID and EDR event timestamp) extracted.
- Determine the relevant logical service each EDR belongs to. For some EDRs this is determined directly from the EDR itself. Other EDRs must be correlated with an initial EDR (e.g. an INITIALDP or SIP INVITE).
Subsequent actions (such as storing EDRs, summarising EDRs or generating errors or statistics) are performed by subsequent connected process groups.
The output ports on this process group may differ, depending on the specific N-Squared products installed and how the service determination is configured.
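The correlation step described above can be illustrated with a minimal Python sketch. This is not the N-Squared Groovy implementation: the field names (`session_id`, `event`, `service`), the `UNKNOWN` fallback, and the assumption that EDRs arrive in order are all hypothetical and chosen only to show the idea of correlating later EDRs with an initial EDR.

```python
from typing import Iterable

# The guide names INITIALDP and SIP INVITE as examples of initial EDRs.
# Treating the "event" field as the place they appear is an assumption.
INITIAL_EVENTS = {"INITIALDP", "SIP INVITE"}


def determine_services(edrs: Iterable[dict]) -> list[dict]:
    """Tag each EDR with a service, correlating on the session ID.

    Assumes EDRs arrive in order, so the initial EDR for a session is seen
    before the EDRs that depend on it.
    """
    service_by_session: dict[str, str] = {}
    tagged = []
    for edr in edrs:
        session = edr["session_id"]
        if "service" in edr:
            # The service is stated directly by the EDR itself.
            service = edr["service"]
            if edr.get("event") in INITIAL_EVENTS:
                # Remember it so later EDRs in the session can be correlated.
                service_by_session[session] = service
        else:
            # Fall back to the service recorded for the session's initial EDR.
            service = service_by_session.get(session, "UNKNOWN")
        tagged.append({**edr, "service": service})
    return tagged
```

In a real flow, EDRs that cannot be correlated would more likely be routed to an error or retry path than tagged with a placeholder value.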
Configuration
The following configuration is required for the N2SVCD EDR Parsing process group:
Parameter | Default Value | Purpose |
---|---|---|
`N2SVCD_EDR_ERROR_DIR` | `/opt/nifi/edr/error` | If an error occurs while reading an EDR file as a whole (before individual EDRs can be processed), the EDR file is written back to this directory. |
`N2SVCD_EDR_INPUT_DIR` | `/opt/nifi/edr/input` | The directory on the reporting server from which EDR files are read. EDR files should be moved into this directory (rather than written) so that the move is an atomic filesystem operation. |
`N2SVCD_EDR_READER_SCRIPT` | `/usr/share/n2rep/etc/nifi/n2svcd_reader.groovy` | The location of the Groovy script for parsing N2SVCD EDRs. |
`N2REPORTING_PG_DB_DRIVER` | `/opt/nsquared/ocs/lib/postgresql-42.3.1.jar` | The location of the Java jar file for PostgreSQL JDBC connectivity. |
`N2REPORTING_PG_DB_URL` | `jdbc:postgresql://127.0.0.1/n2reporting` | The full JDBC URL for the PostgreSQL n2reporting database. |
`N2REPORTING_PG_USERNAME` | `n2reporting_writer` | The username to connect to the reporting database with. |
`N2REPORTING_PG_PASSWORD` | `n2reporting_writer` | The database password to connect to the reporting database with. |
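The advice to back up EDR files and to move (not write) them into the input directory can be sketched as follows. This is a minimal illustration, not part of the N2ACD packages: the function name and the idea of a separate staging and backup directory are assumptions, and the staging directory must be on the same filesystem as the input directory for the rename to be atomic.

```python
import os
import shutil
from pathlib import Path


def deliver_edr_file(source: Path, backup_dir: Path, input_dir: Path) -> Path:
    """Back up an EDR file, then move it atomically into the NiFi input directory."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    # Keep a copy first: NiFi deletes files from the input directory as it reads them.
    shutil.copy2(source, backup_dir / source.name)

    target = input_dir / source.name
    # os.replace is a rename. It is atomic on POSIX when source and target are on
    # the same filesystem; across filesystems it raises OSError rather than copying.
    os.replace(source, target)
    return target
```

A caller might write each EDR file to a staging directory next to `/opt/nifi/edr/input` and then call `deliver_edr_file` once the file is complete, so NiFi never sees a partially written file.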
The “N2ACD Service EDR Processing” Process Group
This process group stores raw EDRs identified as part of the ACD service in the `n2acd.raw_json_edr` database table. It aggregates the EDRs received so that one row per voice call is stored in the database table `n2acd.summarised_edr`.
See the reporting db node installation instructions for creating the reporting database, and the reporting database data model for details on the reporting tables themselves.
This process group relies on previous service determination. EDRs not belonging to calls processed by the N2ACD service are not expected to be processed by this process group.
This process group will:
- Store EDRs to the N2ACD reporting database `raw_json_edr` database table.
- Process EDRs to determine the purpose of the EDR and update the CDR record for the ACD call in the `summarised_edr` table.
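The aggregation of many EDRs into one row per call can be pictured with the short Python sketch below. It only illustrates the grouping idea: the field names (`session_id`, `timestamp`) and the summary columns (`first_event`, `last_event`, `edr_count`) are hypothetical, and the real columns of `summarised_edr` are defined by the reporting database data model, not by this example.

```python
def summarise_edrs(edrs):
    """Collapse a sequence of JSON EDRs into one summary row per session."""
    rows = {}
    for edr in edrs:
        # Create the row for this session on first sight, then update it.
        row = rows.setdefault(
            edr["session_id"],
            {
                "session_id": edr["session_id"],
                "first_event": edr["timestamp"],
                "last_event": edr["timestamp"],
                "edr_count": 0,
            },
        )
        row["first_event"] = min(row["first_event"], edr["timestamp"])
        row["last_event"] = max(row["last_event"], edr["timestamp"])
        row["edr_count"] += 1
    return list(rows.values())
```

Timestamps are assumed to be directly comparable values, such as epoch seconds.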
Configuration
The following configuration is required for the N2ACD Service EDR Processing process group:
Parameter | Default Value | Purpose |
---|---|---|
`N2ACD_EDR_READER_SCRIPT` | `/usr/share/n2acd/etc/nifi/n2acd_reader.groovy` | The location of the Groovy script for additional N2ACD-specific parsing of EDR data. |
`N2REPORTING_PG_DB_DRIVER` | `/opt/nsquared/ocs/lib/postgresql-42.3.1.jar` | The location of the Java jar file for PostgreSQL JDBC connectivity. |
`N2REPORTING_PG_DB_URL` | `jdbc:postgresql://127.0.0.1/n2reporting` | The full JDBC URL for the PostgreSQL n2reporting database. |
`N2REPORTING_PG_USERNAME` | `n2reporting_writer` | The username to connect to the reporting database with. |
`N2REPORTING_PG_PASSWORD` | `n2reporting_writer` | The database password to connect to the reporting database with. |
The “N2ACD DB Extract” Process Group
This process group extracts source data from the N2ACD service database into database tables stored in the `n2acd` schema. Each extract is timestamped, and multiple extracts will be stored (based on storage capacity and partitioning configuration).
See the reporting db node installation instructions for creating the reporting database, and the reporting database data model for details on the reporting tables themselves.
This process group will:
- On a regular basis, copy customer, service and flow data from the service database to the reporting database.
Configuration
The following configuration is required for the N2ACD DB Extract process group:
Parameter | Default Value | Purpose |
---|---|---|
`N2REPORTING_PG_DB_DRIVER` | `/opt/nsquared/ocs/lib/postgresql-42.3.1.jar` | The location of the Java jar file for PostgreSQL JDBC connectivity. |
`N2REPORTING_PG_DB_URL` | `jdbc:postgresql://127.0.0.1/n2reporting` | The full JDBC URL for the PostgreSQL n2reporting database. |
`N2REPORTING_PG_USERNAME` | `n2reporting_writer` | The username to connect to the reporting database with. |
`N2REPORTING_PG_PASSWORD` | `n2reporting_writer` | The database password to connect to the reporting database with. |
`N2ACD_SERVICE_DB_URL` | `jdbc:postgresql://n2-p-acd-sms-01/n2in` | The full JDBC URL for the PostgreSQL n2in database with the n2acd service database schema. |
`N2ACD_SERVICE_DB_PG_USERNAME` | `n2acd_owner` | The username to connect to the N2ACD SMS service database with. |
`N2ACD_SERVICE_DB_PG_PASSWORD` | `n2acd_owner` | The database password to connect to the N2ACD SMS service database with. |
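Because this process group depends on two JDBC URLs, it can help to check their host and database parts before entering them in NiFi. The Python sketch below is an illustrative helper only (the function name is hypothetical, and it handles only the simple `jdbc:postgresql://host[:port]/database` form shown in the table above). When no port is given it falls back to 5432, the PostgreSQL default.

```python
from urllib.parse import urlsplit


def parse_pg_jdbc_url(jdbc_url: str) -> dict:
    """Split a simple PostgreSQL JDBC URL into host, port and database name."""
    prefix = "jdbc:"
    if not jdbc_url.startswith(prefix + "postgresql://"):
        raise ValueError(f"not a PostgreSQL JDBC URL: {jdbc_url!r}")
    # Drop the "jdbc:" prefix so the remainder parses as an ordinary URL.
    parts = urlsplit(jdbc_url[len(prefix):])
    return {
        "host": parts.hostname,
        "port": parts.port or 5432,  # PostgreSQL default port when omitted
        "database": parts.path.lstrip("/"),
    }
```

For example, passing the default `N2ACD_SERVICE_DB_URL` value yields the host `n2-p-acd-sms-01` and the database `n2in`, which can then be checked for reachability from the reporting server.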
NiFi Dataflow Installation
NiFi dataflow installation and configuration requires two steps:
- The import of process groups as templates, and then the creation of process groups from those templates.
- The installation-specific configuration of those process groups.
Importing Process Group Templates
Each process group template file is installed with the N2ACD SMS API package and can be found in the `/usr/share/n2rep/etc/nifi` and `/usr/share/n2acd/etc/nifi` directories on the reporting server.
Import these process groups as NiFi templates, then create process groups from them.
File | Description |
---|---|
`N2SVCD_EDR_Parsing.xml` | A template for the N2SVCD EDR Parsing process group. |
`N2ACD_Service_EDR_Processing.xml` | A template for the N2ACD Service EDR Processing process group. |
`N2ACD_DB_Extract.xml` | A template for the N2ACD DB Extract process group. |
To upload a template:
- Right click on the canvas in the NiFi GUI and select `Upload template`.
- Select the template to upload from the local drive. This will be an `xml` file.
- Upload the template by accepting the selected file. If the template is uniquely named, the template will appear in the templates list.
Then to use a template:
- Using the `Template` option in the NiFi header, drag the icon onto the canvas and select the template to create a process group from.
- Edit the new process group (right click the process group and select `Configure`) and, in the `General` tab, set the `Process Group Parameter Context` to the process group parameter context (see below).
- If required, configure the password for any controller services that access the database.
- Enable each “Controller Services” service in the configuration for each process group. Each controller service needs to be enabled even if it requires no site-specific configuration.
- Enable the process group (right click and select `Start`).
Configuring Installation Specific Parameters
Configuration for the dataflow templates provided by N-Squared for NiFi is done through the NiFi “Parameter Contexts” feature. To access the parameter contexts, use the menu (burger icon) in the top-right of the NiFi header.
In this menu, create or edit a parameter context, adding the parameters listed for each process group in this configuration manual.
The parameter context must be named. The name can be unique to your installation.
Once the parameter context is created, configure each of the process groups by right clicking on the process group and setting its parameter context.
Note that this must be done for each process group individually. Process groups do not inherit their parent parameter context group.
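Since a single parameter context must carry the parameters for every process group, a quick completeness check can catch omissions before the groups are started. The Python sketch below is one possible approach, not an N-Squared tool: the parameter names are taken from the configuration tables in this guide, while the `context` dictionary is assumed to be a hand-maintained copy of the values you intend to enter in NiFi (it is not read from NiFi itself).

```python
REPORTING_DB = [
    "N2REPORTING_PG_DB_DRIVER",
    "N2REPORTING_PG_DB_URL",
    "N2REPORTING_PG_USERNAME",
    "N2REPORTING_PG_PASSWORD",
]

# Parameters required by each process group, as listed in this guide.
REQUIRED_PARAMETERS = {
    "N2SVCD EDR Parsing": [
        "N2SVCD_EDR_ERROR_DIR",
        "N2SVCD_EDR_INPUT_DIR",
        "N2SVCD_EDR_READER_SCRIPT",
        *REPORTING_DB,
    ],
    "N2ACD Service EDR Processing": ["N2ACD_EDR_READER_SCRIPT", *REPORTING_DB],
    "N2ACD DB Extract": [
        *REPORTING_DB,
        "N2ACD_SERVICE_DB_URL",
        "N2ACD_SERVICE_DB_PG_USERNAME",
        "N2ACD_SERVICE_DB_PG_PASSWORD",
    ],
}


def missing_parameters(context: dict) -> dict:
    """Return, per process group, the required parameters that are unset or empty."""
    missing = {}
    for group, names in REQUIRED_PARAMETERS.items():
        absent = [name for name in names if not context.get(name)]
        if absent:
            missing[group] = absent
    return missing
```

An empty result means every parameter named in this guide has a value in `context`.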
Configuring Passwords
Passwords are considered sensitive information in NiFi and are not stored in templates, even when parameters are being used. Edit the controller services listed below in the imported NiFi templates and set the password for each database connection.
Note that the password can be set to the parameter by using the parameter formatted value (`#{N2ACD_SERVICE_DB_PG_PASSWORD}` or `#{N2REPORTING_PG_PASSWORD}`), or can be set directly to the password.
Passwords are required for:
- The reporting database. In the `N2SVCD EDR Parsing` process group, in the `n2reporting` controller service, in the field `Password`.
- The reporting database. In the `N2ACD Service EDR Processing` process group, in the `n2reporting` controller service, in the field `Password`.
- The reporting database. In the `N2ACD DB Extract` process group, in the `n2reporting` controller service, in the field `Password`.
- The ACD SMS service database. In the `N2ACD DB Extract` process group, in the `N2ACD Service Database` controller service, in the field `Password`.