A flexible component that consumes data becomes a key element. Data sources can be external, or they can span multiple departments within a large organization. External sources could be third-party data needed for validation, for example credit scores or ratings. Internal data could come from CRM and ERP modules. The second important task that such a data consumption or ingestion module must perform is filtering inputs based on certain conditions and applying suitable transformations. The third task is to transport this data into any desired storage or relay channel.
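The three tasks described above (consume, filter and transform, transport) can be sketched in a few lines of Python. This is only an illustrative sketch and not the API of any particular product: the class `IngestionPipeline`, its fields, and the sample CRM and credit-score records are all hypothetical names and data invented for the example.

```python
from dataclasses import dataclass
from typing import Callable, Iterable

Record = dict  # one ingested item, e.g. a CRM row or a credit-score entry


@dataclass
class IngestionPipeline:
    """Hypothetical pipeline: consume -> filter -> transform -> transport."""
    keep: Callable[[Record], bool]         # filter condition
    transform: Callable[[Record], Record]  # transformation applied to kept records
    sink: Callable[[Record], None]         # storage or relay channel

    def run(self, *sources: Iterable[Record]) -> int:
        """Read every source in turn and return the number of records delivered."""
        delivered = 0
        for source in sources:
            for record in source:
                if not self.keep(record):
                    continue  # record rejected by the filter
                self.sink(self.transform(record))
                delivered += 1
        return delivered


def normalize(record: Record) -> Record:
    """Example transformation: lower-case the email address if there is one."""
    out = dict(record)
    if isinstance(out.get("email"), str):
        out["email"] = out["email"].lower()
    return out


# Made-up inputs: one internal source (CRM) and one external source (credit scores).
crm_rows = [
    {"customer": "a1", "email": "A@Example.com"},
    {"customer": "a2", "email": None},
]
credit_scores = [{"customer": "a1", "score": 712}]

store = []  # stands in for the destination storage

pipeline = IngestionPipeline(
    keep=lambda r: r.get("email") is not None or "score" in r,
    transform=normalize,
    sink=store.append,
)
delivered = pipeline.run(crm_rows, credit_scores)
```

Keeping the filter, the transformation, and the sink as plain callables means each of them can be swapped without touching the loop, which is one simple way to get the flexibility the paragraph asks for.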
Data may be in an unstructured format where it originates. Bringing this data in and storing it in a convenient form, so that batch processing and real-time processing systems can understand and act on it, is what data consumption, data collection, or data ingestion is all about.
In the Sentienz data platform Half life, the Beaver component supports virtually any data format or source: transactional data, device/sensor data, application logs, access logs, files, tables, blogs and social media, and click streams.
Data ingestion as described above is a complex process, but the Sentienz data platform simplifies it by providing a GUI for specifying schemas, which is especially useful for unstructured source data. Further, the tool lets the user specify the endpoint storage as well as filters and transformations.
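To make the idea of a schema for unstructured source data more concrete, here is a minimal Python sketch that applies a declarative schema to a raw access-log line. It does not show how the Sentienz GUI represents or stores schemas; the names `ACCESS_LOG_SCHEMA` and `apply_schema`, the schema layout, and the sample log line are assumptions made up for illustration.

```python
import re

# Hypothetical schema: a regular expression with named fields, plus type casts.
ACCESS_LOG_SCHEMA = {
    "pattern": re.compile(
        r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
        r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
        r'(?P<status>\d{3}) (?P<size>\d+|-)'
    ),
    "types": {"status": int},
}


def apply_schema(line, schema):
    """Turn one raw text line into a structured record, or None if it does not fit."""
    match = schema["pattern"].match(line)
    if match is None:
        return None
    record = match.groupdict()
    for field, cast in schema["types"].items():
        record[field] = cast(record[field])
    return record


# Made-up access-log line in the common web-server layout.
line = '203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326'
record = apply_schema(line, ACCESS_LOG_SCHEMA)
rejected = apply_schema("not a log line", ACCESS_LOG_SCHEMA)
```

Once a line has been turned into named, typed fields in this way, filters and transformations can be expressed against those fields (for example, keeping only records whose `status` is 500 or above) instead of against raw text.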