Connecting a new data source to your data warehouse is often quick once we have the right information and permissions. Ingesting (importing) the data may take longer, and how long it takes depends directly on the amount of data that needs to be ingested.
To get started, we will need the following:
- Name of the source/connection
- Access credentials if we do not already have them
- Type and scope of data needed
  - Data sources often have hundreds or even thousands of tables. In nearly all cases, we will not need to ingest all of them to deliver the insights you are hoping to get. Limiting the scope of the connection to just the data required keeps usage costs to a minimum.
- Historical depth needed
  - The source system we are connecting to may have data going back years. Depending on the amount of data, this can drastically affect the usage costs and timeline for data ingestion. Let us know how far back we need to go when pulling the data.
- Update frequency
  - Is this a one-time pull, or do you need the data updated from this system on a regular basis? Costs generally rise the more frequently you need an update, so an update every five minutes is usually pricier than a daily update. If you have questions about how update frequency will affect usage costs, we can provide ballpark estimates for different update frequencies once we have a better understanding of the data set.
- Reports and dashboards
  - Once the data is available, should it be incorporated into an existing report or dashboard, or do you need a new one?
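
If it helps to gather these items before reaching out, the checklist above could be captured in a small structured form. The following Python sketch is only an illustration: the class name `ConnectionRequest`, its fields, and the example source and table names are all hypothetical and are not part of any tool we use. It records the requested scope, historical depth, and update frequency, flags items that are still missing, and shows how many sync runs per day a given update interval implies. It does not estimate cost.

```python
from dataclasses import dataclass, field
from datetime import timedelta
from typing import List, Optional


@dataclass
class ConnectionRequest:
    """Hypothetical container for the checklist items above."""
    source_name: str
    tables: List[str] = field(default_factory=list)  # scope: only the tables needed
    history_days: Optional[int] = None               # None means "all available history"
    update_interval: Optional[timedelta] = None      # None means "one-time pull"
    credentials_provided: bool = False
    report_target: str = "new"                       # "new" or "existing"

    def syncs_per_day(self) -> float:
        """Number of sync runs per day implied by the update interval."""
        if self.update_interval is None:
            return 0.0  # a one-time pull has no recurring syncs
        return timedelta(days=1) / self.update_interval

    def missing_items(self) -> List[str]:
        """List the checklist items that still need to be supplied."""
        missing = []
        if not self.source_name.strip():
            missing.append("name of the source/connection")
        if not self.credentials_provided:
            missing.append("access credentials")
        if not self.tables:
            missing.append("type and scope of data needed")
        return missing


# Example with made-up values: two tables, two years of history, daily updates.
request = ConnectionRequest(
    source_name="example_crm",
    tables=["accounts", "invoices"],
    history_days=730,
    update_interval=timedelta(days=1),
)
print(request.missing_items())   # credentials have not been supplied yet
print(request.syncs_per_day())   # one sync per day
```

Changing `update_interval` to `timedelta(minutes=5)` raises the count from 1 sync per day to 288, which illustrates why more frequent updates usually cost more.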