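Question
I think this is a very common problem, and I hope there is a solution/approach that we can reuse.
We are building a data lake in Azure ADLS gen2 with a unidirectional data flow: Nifi/ADF -> ADLS -> ETL/Spark/Databricks -> Data Warehouse -> Power BI. Some ETL inputs have to be loaded/updated weekly or monthly by the responsible business users.
Could you suggest or improve a solution that lets business users upload ETL inputs and meets the following requirements?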
Requirements
How well each requirement is met is rated from 1 (very bad practice) to 5 (100% clean, easy-to-implement solution).
Possible solutions
Business users -> Power Apps -> Data Warehouse & Stored Procedures -> ADLS -> Spark -> Data Warehouse -> Power BI.
- 1st requirement = 5. Very user-friendly interface built with Power Apps.
- 2nd requirement = 2. Validations/transformations would live in SQL stored procedures, which is a poor fit because all other application code is written in Spark.
- 3rd requirement = 3-5. Not sure how to implement this yet.
- 4th requirement = 2. The data flow becomes bidirectional (`DW -> ADLS -> DW`), which is harder to reason about and orchestrate.
Business users -> Microsoft Storage Explorer app -> ADLS gen2 -> Azure Blob Storage trigger -> Azure Function -> Spark parsing/validation job -> ADLS gen2.
- 1st requirement = 3-4. Uploading through Storage Explorer is very user friendly. The only issue is that the user can be notified about success/failure only by email, which may not be very clear.
- 2nd requirement = 5. I like that parsing/validation happens on the ETL side, not in Data Warehouse stored procedures.
- 3rd requirement = 1-3. It isn't clear how to achieve that currently. I expect it to be worse than with Power Apps.
- 4th requirement = 4. Unidirectional process: data isn't moved from the DW to the Data Lake. 4 rather than 5 because it isn't very clear to business users that the success/failure notification will arrive by email, and the implementation is somewhat more complex.
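To make the parsing/validation step of the second option more concrete, here is a minimal sketch of what such a check could look like. It is plain Python using only the standard library, standing in for the Spark job; the column names (`cost_center`, `month`, `budget`), the `SCHEMA` mapping, and the `validate_upload` function are hypothetical and only illustrate the idea of collecting every error so that the notification email can list them all at once.

```python
import csv
import io
import re
from dataclasses import dataclass, field


def parse_non_empty(value: str) -> str:
    """Reject blank cells."""
    if not value.strip():
        raise ValueError("must not be empty")
    return value.strip()


def parse_month(value: str) -> str:
    """Accept only months written as YYYY-MM, e.g. 2020-04."""
    if not re.fullmatch(r"\d{4}-(0[1-9]|1[0-2])", value):
        raise ValueError(f"expected YYYY-MM, got {value!r}")
    return value


# Hypothetical template: column name -> parser that raises ValueError on bad input.
SCHEMA = {"cost_center": parse_non_empty, "month": parse_month, "budget": float}


@dataclass
class ValidationReport:
    file_name: str
    rows_checked: int = 0
    errors: list = field(default_factory=list)

    @property
    def ok(self) -> bool:
        return not self.errors

    def as_message(self) -> str:
        """Text that could go into the success/failure notification."""
        if self.ok:
            return f"{self.file_name}: accepted, {self.rows_checked} rows."
        return f"{self.file_name}: rejected.\n" + "\n".join(self.errors)


def validate_upload(file_name: str, content: str) -> ValidationReport:
    """Check the header and every cell; collect all errors instead of stopping at the first."""
    report = ValidationReport(file_name)
    reader = csv.DictReader(io.StringIO(content))
    missing = [c for c in SCHEMA if c not in (reader.fieldnames or [])]
    if missing:
        report.errors.append(f"missing columns: {', '.join(missing)}")
        return report
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        report.rows_checked += 1
        for column, parse in SCHEMA.items():
            try:
                parse(row[column] or "")  # short rows yield None for missing cells
            except ValueError as exc:
                report.errors.append(f"line {line_no}, column '{column}': {exc}")
    return report
```

Calling `validate_upload("budget.csv", text).as_message()` would give the body of the email. Pointing users at the exact line and column is one way to make email feedback less opaque.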
Posted on 2020-04-30 11:04:47
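Situation
As I understand it, you are looking for a unidirectional ETL process/architecture with the following properties (in order of priority):
The definition of user-friendliness is also somewhat vague, because users eventually get used to unintuitive tools, for example by being made to attend a course. I know business users who have no interest at all in using Power, but they have no choice in the matter.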
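Recommendation
My general experience is that users are happier with a custom front end tailored to the business need than with being handed a huge Swiss army knife of which they use only a few functions in their daily work. I have never seen PowerApps in use, but from their website and https://alternativeto.net/software/microsoft-powerapps/ I gather that it is some kind of low-code application/UI building platform.
I personally lean toward low-code tools with large user communities (such as Tableau, Qlik, or Appian). I am not affiliated with any of them, but I have successfully connected each of them to a SQL database, which is why I mention these three.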
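You said:
"Storage Explorer has a sufficiently user-friendly interface, and the implementation cost is zero. The only concerns are validation feedback and auditing. To reduce the need for validation, we could create some Excel templates for business users."
So I guess you may favor this solution, but I would never sacrifice auditing. As with backups, you usually find out the true cost of not having a backup or an audit trail only when an incident happens. In cases of cyberattacks or white-collar crime, businesses usually need log files badly.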
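To illustrate what a minimal audit trail for uploads could record, here is a sketch in plain Python. The field names, the `build_audit_record` and `append_audit_record` helpers, and the JSON Lines file are my own assumptions, not anything from your setup. In a real deployment the log would have to live somewhere business users cannot modify.

```python
import hashlib
import json
from datetime import datetime, timezone


def build_audit_record(user: str, file_name: str, content: bytes,
                       outcome: str, details: str = "") -> dict:
    """Describe one upload attempt; the hash ties the entry to the exact file bytes."""
    return {
        "timestamp_utc": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "file_name": file_name,
        "sha256": hashlib.sha256(content).hexdigest(),
        "size_bytes": len(content),
        "outcome": outcome,  # hypothetical values: "accepted" or "rejected"
        "details": details,
    }


def append_audit_record(log_path: str, record: dict) -> None:
    """Append one JSON object per line; existing entries are never rewritten."""
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(record, sort_keys=True) + "\n")
```

Recording rejected uploads as well as accepted ones matters here: after an incident, the failed attempts are often the more interesting part of the log.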
https://stackoverflow.com/questions/61340510