call to discuss more.
I have a problem statement.
There are two batch applications:
1. *Hadoop-based Hive database*: holds transactions for the Asian client, millions of records
2. *SQL Server database*: holds transactions for the US client, millions of records
There is a new business requirement: we need to update transaction records to change the *Settlement indicator* column in both databases, based on common rules/logic. Since the business rules are the same for the Asia and US applications, we would like to create a central *common Python program* that updates the settlement indicator as follows:
*Asia Hive:*
1. The Asia transaction load job runs and ingests Asia transactions into the Hive table
2. In the next job, the Asia Hive transactions get fed to the Python program
3. The Python program generates the settlement indicator value, which gets updated in the Asia Hive table
*US SQL:*
1. The US transaction load job runs and ingests US transactions into the SQL table
2. In the next job, the US SQL transactions get fed to the Python program
3. The Python program generates the settlement indicator value, which gets updated in the US SQL table.
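To make the "common Python program" idea concrete, here is a rough sketch of a source-agnostic rule function that both the Asia and US jobs could call. The field names (`txn_id`, `status`, `amount`) and the rule itself are made up for illustration; the real business rules are not stated here.

```python
from typing import Dict, Iterable, Iterator, Tuple


def settlement_indicator(txn: Dict) -> str:
    """Return the indicator for one transaction.

    Hypothetical rule: 'Y' when the transaction is cleared and has a
    positive amount, otherwise 'N'. Replace with the real business logic.
    """
    return "Y" if txn["status"] == "CLEARED" and txn["amount"] > 0 else "N"


def compute_indicators(rows: Iterable[Dict]) -> Iterator[Tuple[str, str]]:
    """Yield (txn_id, indicator) pairs; the rows can come from Hive or SQL Server."""
    for txn in rows:
        yield txn["txn_id"], settlement_indicator(txn)


if __name__ == "__main__":
    sample = [
        {"txn_id": "A1", "status": "CLEARED", "amount": 120.0},  # Asia row
        {"txn_id": "U1", "status": "PENDING", "amount": 75.5},   # US row
    ]
    for txn_id, indicator in compute_indicators(sample):
        print(txn_id, indicator)
```

Because the function only sees plain dictionaries, each job is free to read and write with whatever connector suits its database, and the rules stay in one place.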
On the SQL side, it's sorted.
*There are challenges on the Hive side:*
1. We cannot create stored procedures in Hive.
2. Since reading the table directly from Python is not recommended, how do we feed Hive transactions to Python? Do we need to replicate the data from Hive to some other source, such as a file or another table, which Python will read?
3. If Python is somehow able to read the Hive transactions, how will the update happen? Since Hive doesn't support direct updates, how do we update the table with the settlement indicator created by Python?
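One possible shape for the Hive side, as a sketch only: read in batches through any DB-API cursor (for example one from a Hive client library such as PyHive, which is an assumption here), load the computed indicators into a staging table, then rewrite the target with `INSERT OVERWRITE` joined to that staging table. The table and column names in the usage comment are hypothetical, and the generated statement assumes the indicator is the last column of the target table. If the table is a transactional (ACID) Hive table, `UPDATE` or `MERGE` may be available instead.

```python
from typing import Iterator, List, Sequence, Tuple


def fetch_in_batches(cursor, query: str, batch_size: int = 10_000) -> Iterator[List[Tuple]]:
    """Yield rows from a DB-API cursor in batches so millions of rows
    are never held in memory at once."""
    cursor.execute(query)
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        yield rows


def build_overwrite_sql(target: str, staging: str, key: str,
                        other_columns: Sequence[str], indicator: str) -> str:
    """Build a HiveQL statement that rewrites `target`, taking the indicator
    from `staging` where a matching key exists and keeping the old value otherwise.

    Assumes the column order is: key, other_columns..., indicator.
    """
    cols = ", ".join(f"t.{c}" for c in [key, *other_columns])
    return (
        f"INSERT OVERWRITE TABLE {target} "
        f"SELECT {cols}, COALESCE(s.{indicator}, t.{indicator}) AS {indicator} "
        f"FROM {target} t LEFT JOIN {staging} s ON t.{key} = s.{key}"
    )


# Usage with hypothetical names:
# sql = build_overwrite_sql("asia_txn", "asia_txn_ind", "txn_id",
#                           ["amount"], "settlement_indicator")
# cursor.execute(sql)
```

For a partitioned table, overwriting only the partitions touched by the latest load would likely be cheaper than rewriting the whole table.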
Too much text. Please draw a diagram to show how the data flows and where you want the data to be processed.
Can we connect, please? I will explain.
nope. use the group
if you send it in the group there are higher chances of having your problem solved btw 😇