Qualitis, the project recommended in this issue, is a data quality management platform that supports quality verification, notification, and management for a variety of data sources. It addresses the data quality problems that arise during data processing.
Project Overview
Qualitis is built on Spring Boot and submits quality model tasks to the Linkis platform. It provides data quality model construction, data quality model execution, data quality verification, data quality reporting, and other functions.
Qualitis also provides enterprise-class features such as financial-grade resource isolation, resource management, and access control. It is designed to keep working properly in high-concurrency, high-performance, and high-availability scenarios.
Features
- Define the data quality model
The following data quality models are supported: 1. Single-table model. 2. Multi-table model. 3. Custom model. Qualitis also presets multiple data quality verification templates, including null check, blank check, number check, enumeration check, and other common checks, which simplifies the definition of a data quality model.
- Data quality model scheduling
Supports data quality model scheduling.
- Data quality report
Supports data quality reporting.
- Log management
Supports log management for data quality tasks.
- Abnormal data management
Supports abnormal data storage to quickly locate faults.
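To make the preset templates more concrete, here is a minimal Python sketch of what null, blank, number, and enumeration checks test for. It is an illustration only: Qualitis runs its checks as tasks submitted to Linkis, and the function names and sample rows below are hypothetical and not part of Qualitis.

```python
# Illustrative only: hypothetical helpers, not the Qualitis API.

def null_check(rows, column):
    """Return the rows whose value in `column` is missing (None)."""
    return [r for r in rows if r.get(column) is None]

def blank_check(rows, column):
    """Return the rows whose value in `column` is an empty or whitespace-only string."""
    return [r for r in rows
            if isinstance(r.get(column), str) and r[column].strip() == ""]

def number_check(rows, column):
    """Return the rows whose value in `column` cannot be parsed as a number."""
    bad = []
    for r in rows:
        try:
            float(r.get(column))
        except (TypeError, ValueError):
            bad.append(r)
    return bad

def enumeration_check(rows, column, allowed):
    """Return the rows whose value in `column` is not one of `allowed`."""
    return [r for r in rows if r.get(column) not in allowed]

# Made-up sample data for the demonstration.
rows = [
    {"id": 1, "name": "alice", "age": "30", "status": "ACTIVE"},
    {"id": 2, "name": "   ", "age": "n/a", "status": "UNKNOWN"},
    {"id": 3, "name": None, "age": "41", "status": "INACTIVE"},
]

print([r["id"] for r in null_check(rows, "name")])      # [3]
print([r["id"] for r in blank_check(rows, "name")])     # [2]
print([r["id"] for r in number_check(rows, "age")])     # [2]
print([r["id"] for r in enumeration_check(rows, "status", {"ACTIVE", "INACTIVE"})])  # [2]
```

Each helper returns the offending rows rather than a pass/fail flag, which mirrors the idea of keeping abnormal data so that faults can be located afterwards.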
System architecture diagram
Rapid deployment
1. Basic software installation
Gradle (4.9)
MySQL (5.5+)
JDK (1.8.0_141)
Hadoop (2.7.2)
Hive (1.2.1)
Zookeeper (3.4.9)
Linkis (0.9.1); the Spark engine is required.
2. Download
Address: https://github.com/WeBankFinTech/Qualitis/releases
3. Compile
gradle clean distZip
Installation
Decompress
zip
unzip qualitis-{VERSION}.zip
tar
tar -zxvf qualitis-{VERSION}.tar.gz
Connect to MySQL and initialize data
mysql -u {USERNAME} -p{PASSWORD} -h {IP} --default-character-set=utf8
source conf/database/init.sql
Modify the configuration
vim conf/application-dev.yml
Modify the following configuration:
## Database configuration
spring.datasource.username=
spring.datasource.password=
spring.datasource.url=
## Database configuration (same as above)
task.persistence.username=
task.persistence.password=
task.persistence.address=
## Zookeeper address
zk.address=
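Before starting the service, it can help to confirm that none of the keys above were left empty. The following Python sketch assumes the settings are written as flat `key=value` lines exactly as listed above; if your `application-dev.yml` uses nested YAML instead, a YAML parser would be needed. The function name `find_empty_keys` and the sample values are hypothetical.

```python
# Keys taken from the configuration list above.
REQUIRED_KEYS = [
    "spring.datasource.username",
    "spring.datasource.password",
    "spring.datasource.url",
    "task.persistence.username",
    "task.persistence.password",
    "task.persistence.address",
    "zk.address",
]

def find_empty_keys(text, required=REQUIRED_KEYS):
    """Return the required keys that are missing or have an empty value."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not key=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values may contain "=" themselves.
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return [k for k in required if not values.get(k)]

# Placeholder values for illustration; replace with your own settings.
sample = """\
## Database configuration
spring.datasource.username=example_user
spring.datasource.password=example_password
spring.datasource.url=
## Zookeeper address
zk.address=127.0.0.1:2181
"""

for key in find_empty_keys(sample):
    print("not set:", key)
```

To check a real file, read it with `open("conf/application-dev.yml").read()` and pass the text to `find_empty_keys`.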
Start service
dos2unix bin/*
sh bin/start.sh
Login
Open a browser and go to “localhost:8090”.
Configuration
Click “Configuration” -> “Cluster Configuration” to add a cluster. Enter the following configuration information:
Cluster name (Hadoop cluster name)
Cluster type
Linkis address
Linkis Token
Example
Hint:
Qualitis stores abnormal data in a database. The name of that database can be configured in the system settings, as shown below:
Qualitis provides the expression ${USERNAME}, which is replaced with the user name, as shown in the figure. Abnormal data produced by tasks that different users run is therefore stored in each user's own database.
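The effect of the ${USERNAME} placeholder can be sketched in a few lines of Python. This does not reproduce Qualitis's own substitution code; it only shows the idea using the standard library's `string.Template`, which happens to use the same `${NAME}` syntax. The pattern `${USERNAME}_abnormal` and the function name are invented for the example.

```python
from string import Template

def resolve_database_name(pattern, username):
    """Replace ${USERNAME} in `pattern` with the given user name."""
    return Template(pattern).substitute(USERNAME=username)

# Hypothetical setting value; the real one is whatever you enter in the system settings.
pattern = "${USERNAME}_abnormal"

print(resolve_database_name(pattern, "alice"))  # alice_abnormal
print(resolve_database_name(pattern, "bob"))    # bob_abnormal
```

Because the user name is part of the resolved database name, two users running the same rule end up writing to two different databases.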
User Manual
Create project
Once logged in, click the “Rule Configuration” button in the left menu.
In the second level menu below, click the “Project” button to enter the project module.
Then, click the “New Project” button in the upper left corner to open the “New Project” page.
Enter the following information:
1) Project name
The project name, which must be unique.
2) Project introduction
Click “OK” to create the new project.
Run application
