Pipeline Setup for Data Import
-
Document the Data Import Pipeline:
Outline the steps required to set up the data import pipeline.
-
Identify Missing Workers:
Compile a list of workers that are not included in the current implementation.
-
Determine Appropriate Module:
Identify the module where each worker should be located.
- If the necessary module does not exist, create a new one.
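As a rough illustration of the "Identify Missing Workers" step, the comparison can be treated as a set difference between the workers a scenario needs and the workers that already exist. The worker names and the `find_missing_workers` helper below are hypothetical and are not taken from the pipeline itself.

```python
# Hypothetical worker names; substitute the ones the scenario actually needs.
required_workers = ["fetch", "validate", "transform", "load"]
implemented_workers = {"fetch", "load"}


def find_missing_workers(required, implemented):
    """Return the required workers that have no implementation yet, in pipeline order."""
    implemented = set(implemented)
    return [name for name in required if name not in implemented]


missing = find_missing_workers(required_workers, implemented_workers)
print(missing)
```

Keeping the result in pipeline order makes it easier to decide, worker by worker, which module each one belongs in.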
-
Define State File Format:
Specify the state file format that each worker should operate with.
-
Evaluate Existing Options:
Determine if the current options are sufficient to complete the scenario.
- If not, additional development of the generate-state function is required.
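The actual state file format is not specified here, so the following is only a sketch of what writing one down might look like, assuming a JSON file that records the input file and a status per worker. The field names (`input_file`, `steps`, `worker`, `status`) and both classes are assumptions made for illustration.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class StepState:
    worker: str               # hypothetical worker name
    status: str = "pending"   # assumed values: "pending", "done", "failed"


@dataclass
class PipelineState:
    input_file: str
    steps: list = field(default_factory=list)

    def to_json(self) -> str:
        # asdict() recurses into the nested StepState entries.
        return json.dumps(asdict(self), indent=2)

    @classmethod
    def from_json(cls, text: str) -> "PipelineState":
        raw = json.loads(text)
        steps = [StepState(**step) for step in raw["steps"]]
        return cls(input_file=raw["input_file"], steps=steps)


state = PipelineState("input_001.json", [StepState("fetch"), StepState("transform")])
restored = PipelineState.from_json(state.to_json())
```

Pinning the format down as a small type with a round trip makes it easy to check that every worker reads and writes the same fields.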
-
Update Vault (if applicable):
- If Vault is used, add the necessary data to Vault.
-
Create Input Files for the Pipeline:
Generate input files for the pipeline, considering the missing workers.
-
Run State Generator for the First Input File:
Execute the state generator on the first input file.
-
Reach Missing Worker via Workflow Manager:
- Run the workflow manager until execution reaches the missing worker.
-
Create Worker Stub for Payload Display:
- Develop a worker stub that displays the payload.
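A stub of this kind can be very small. The sketch below assumes the payload reaches the worker as a dictionary-like object that can be serialised to JSON. How the workflow manager actually hands over the payload is not described here, so the `stub_worker` name and its signature are hypothetical.

```python
import json


def stub_worker(payload):
    """Print the payload so its structure can be inspected, then return it unchanged."""
    print("=== stub worker received payload ===")
    # default=str keeps the stub from failing on values such as dates.
    print(json.dumps(payload, indent=2, sort_keys=True, default=str))
    return payload


# Example payload with made-up fields, for illustration only.
example_payload = {"input_file": "input_001.json", "date": "2024-01-01"}
stub_worker(example_payload)
```

Returning the payload unchanged lets the stub sit in the workflow without altering what later steps receive.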
-
Write Missing Worker Script:
Develop the script for the missing worker.
-
Run Workflow Manager Post-Debugging:
- After debugging, restart the workflow manager.
-
Create Production Input Files:
- Create input files for production (daily and historical data).
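One possible way to produce both kinds of input file is to generate one file per day: a date range for the historical backfill and a single date for the daily run. The file naming scheme, the JSON content, and the helper names below are assumptions, since the real input file format is not given here.

```python
import json
from datetime import date, timedelta
from pathlib import Path


def date_range(start: date, end: date):
    """Yield every date from start to end, inclusive."""
    current = start
    while current <= end:
        yield current
        current += timedelta(days=1)


def input_file_name(day: date) -> str:
    """Hypothetical naming scheme: one input file per calendar day."""
    return f"input_{day.isoformat()}.json"


def write_input_files(out_dir: Path, start: date, end: date) -> list:
    """Write one input file per day in the range and return the paths created."""
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for day in date_range(start, end):
        path = out_dir / input_file_name(day)
        path.write_text(json.dumps({"date": day.isoformat()}))
        paths.append(path)
    return paths
```

Under these assumptions, a historical backfill would call `write_input_files` with the full date range, and the daily run would pass today's date as both `start` and `end`.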
-
Set Up Cron Job for Daily Tasks:
- Schedule the daily tasks using a cron job (pending update to version 1.9.0).