Hi all!
I would like to generate a table that contains historical information for the last 10 years, using CSV files plus a dataflow that is linked to an on-premises server. The view on the server gets updated every month.
How would you do that?
Store the CSVs in SharePoint, consolidate them with a dataflow, and then build a table based on both dataflows?
Or would you store the CSVs in OneLake and do something there?
If you already have CSV files, you don't need a dataflow. Keep ingesting them directly into your semantic model.
If you are considering OneLake or Fabric, another option would be to convert the CSVs to Parquet. Then you could access them via a Direct Lake connection.
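If you go that route, a minimal sketch of the conversion in a Fabric notebook could look like the following (PySpark). The folder path `Files/history/` and the table name `sales_history` are placeholders, not anything from this thread; adjust them to your own lakehouse layout.

```python
# Read all historical CSVs from the lakehouse Files area.
# Path and options are assumptions - adjust header/schema settings to your files.
df = (
    spark.read
    .option("header", True)
    .option("inferSchema", True)
    .csv("Files/history/*.csv")
)

# Save as a Delta (Parquet-based) table so a Direct Lake semantic model can use it.
df.write.mode("overwrite").format("delta").saveAsTable("sales_history")
```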
I would upload the historical CSV files to OneLake. To bring in current data, I would connect to the on-premises SQL Server through an on-premises data gateway and load the data into the same Lakehouse or dataflow environment. Once both historical and current data are available, you can union them and store the result as a Delta table or a Warehouse table, allowing for scalable, long-term querying.
I think this method is well-suited for large datasets, as OneLake offers better performance and scalability compared to SharePoint. It also provides unified storage, meaning all your data resides in a single, centralized environment, which simplifies data management. With delta format support, this setup is ideal for incremental data processing and advanced analytics. Additionally, it is more future-proof, since the data can easily integrate with other Fabric capabilities such as Spark and SQL analytics tools.
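As a rough illustration of the union step, here is a minimal PySpark sketch. It assumes the historical CSVs have already landed as a Delta table named `sales_history` and that the gateway/dataflow writes the monthly server extract to a table named `sales_current`; all table and column names are hypothetical placeholders.

```python
# Load the two sources that were staged in the lakehouse.
historical = spark.read.table("sales_history")   # 10 years of CSV history (assumed name)
current = spark.read.table("sales_current")      # monthly on-prem extract (assumed name)

# Union by column name and drop duplicates in case the sources overlap on recent months.
combined = (
    historical
    .unionByName(current, allowMissingColumns=True)
    .dropDuplicates()
)

# Persist the combined history as a Delta table for long-term querying and reporting.
combined.write.mode("overwrite").format("delta").saveAsTable("sales_10y")
```

This could run as a scheduled notebook after each monthly refresh, so the combined table stays up to date without touching the historical files again.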
Hi @joshua1990,
we would like to follow up to see if the solution provided by the super user resolved your issue. Please let us know if you need any further assistance.
If our super user response resolved your issue, please mark it as "Accept as solution" and click "Yes" if you found it helpful.
Regards,
Vinay Pabbu