Data Lake for Big Data
A data lake provides a centralised repository to store structured as well as unstructured data at near-infinite scale. At Avonshire, we ensure that the data lake is designed and implemented for maximum consumption of data for analytics and advanced analytics. We further apply metadata management tools to govern the data lake. We integrate the necessary query engines to allow systems and users to consume the lake with the most cost-effective design pattern.
You can store your data as-is, without having to structure it first, and run different types of analytics, from dashboards and visualizations to big data processing, real-time analytics, and machine learning, to guide better decisions.
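Storing data as-is and structuring it only when it is read is often called schema-on-read. The following is a minimal sketch of that idea, not a description of any particular product: the sample records, the `SCHEMA` mapping, and the `read_with_schema` helper are all hypothetical names chosen for illustration.

```python
import json

# Raw events stored as-is, one JSON document per line; fields vary by record.
# (Illustrative sample data, not taken from a real system.)
RAW_EVENTS = [
    '{"user": "a1", "amount": "19.90", "ts": "2024-01-05"}',
    '{"user": "b2", "amount": 5, "channel": "web"}',
    '{"user": "c3"}',
]

# Hypothetical read-time schema: field name -> (converter, default value).
SCHEMA = {
    "user": (str, None),
    "amount": (float, 0.0),
    "channel": (str, "unknown"),
}


def read_with_schema(raw_lines, schema):
    """Parse raw JSON lines and project each record onto the given schema."""
    rows = []
    for line in raw_lines:
        record = json.loads(line)
        row = {}
        for field, (convert, default) in schema.items():
            value = record.get(field)
            # Missing fields fall back to the default; present ones are coerced.
            row[field] = convert(value) if value is not None else default
        rows.append(row)
    return rows


rows = read_with_schema(RAW_EVENTS, SCHEMA)
total_amount = sum(row["amount"] for row in rows)
```

The raw lines are never rewritten. A different consumer could read the same records with a different schema, which is the main design point of keeping the stored data unstructured.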
Salient features of a data lake include data collection in real time as well as in batches, and processing with transient, elastic MapReduce clusters to define data structure, schema, and transformations.
The lake allows data scientists, data developers, and business analysts to access data with a wide array of analytic tools. It allows data developers to run analytics without the need to move data to a separate system.
It also provides the ability to understand what data is in the lake through crawling, cataloging, and indexing. The data is secured with encryption at rest and in transit, using AES-256 keys that are rotated frequently for improved security.
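Crawling, cataloging, and indexing can be illustrated with a small sketch. It assumes the lake is a plain directory tree of CSV and JSON-lines files, which is a simplification. The `crawl`, `build_index`, and `demo` functions and the sample file names are hypothetical, and a managed catalog service would do considerably more.

```python
import csv
import json
import os
import tempfile


def crawl(root):
    """Walk `root` and return a catalog: relative path -> sorted field names."""
    catalog = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root).replace(os.sep, "/")
            if name.endswith(".csv"):
                # For CSV, treat the header row as the list of fields.
                with open(path, newline="") as fh:
                    header = next(csv.reader(fh), [])
                catalog[rel] = sorted(header)
            elif name.endswith(".jsonl"):
                # For JSON lines, take the union of keys across all records.
                fields = set()
                with open(path) as fh:
                    for line in fh:
                        if line.strip():
                            fields.update(json.loads(line).keys())
                catalog[rel] = sorted(fields)
    return catalog


def build_index(catalog):
    """Invert the catalog: field name -> sorted list of datasets containing it."""
    index = {}
    for dataset, fields in catalog.items():
        for field in fields:
            index.setdefault(field, []).append(dataset)
    return {field: sorted(paths) for field, paths in index.items()}


def demo():
    """Build a tiny throwaway lake on disk, then crawl and index it."""
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "sales"))
        with open(os.path.join(root, "sales", "orders.csv"), "w", newline="") as fh:
            fh.write("order_id,customer_id,amount\n1,c1,9.5\n")
        with open(os.path.join(root, "clicks.jsonl"), "w") as fh:
            fh.write('{"customer_id": "c1", "page": "/home"}\n')
            fh.write('{"customer_id": "c2", "referrer": "mail"}\n')
        catalog = crawl(root)
    return catalog, build_index(catalog)


catalog, index = demo()
```

Looking up `index["customer_id"]` then answers the question "which datasets hold customer identifiers?" without opening any of the files again.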
- Data collection
- Batch ingestion
- Stream and real-time ingestion
- Transformation and schema deployment
- Structured and unstructured query
- Data security
- Real-time analytics
- Lambda architecture
- Microservices-based service consumption
- Data lake governance
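
The Lambda architecture listed above combines a periodically recomputed batch view with an incrementally updated speed view, and answers queries by merging the two. The sketch below shows that pattern for simple page-view counts. It is an in-memory illustration only: the `LambdaCounts` class and its method names are hypothetical and stand in for what would normally be separate batch, streaming, and serving systems.

```python
from collections import Counter


class LambdaCounts:
    """Toy Lambda architecture for counting events per key."""

    def __init__(self):
        self.master = []             # append-only log of every event received
        self.batch_view = Counter()  # recomputed from the full log on each batch run
        self.speed_view = Counter()  # incremental counts since the last batch run

    def ingest(self, page):
        """Record an event in the master log and update the speed layer."""
        self.master.append(page)
        self.speed_view[page] += 1

    def run_batch(self):
        """Recompute the batch view from scratch and reset the speed layer."""
        self.batch_view = Counter(self.master)
        self.speed_view = Counter()

    def query(self, page):
        """Serving layer: merge the batch view with the recent speed view."""
        return self.batch_view[page] + self.speed_view[page]


lake = LambdaCounts()
for page in ["home", "home", "pricing"]:
    lake.ingest(page)
lake.run_batch()      # batch view now covers the first three events
lake.ingest("home")   # arrives after the batch run, held only in the speed view

home_total = lake.query("home")
pricing_total = lake.query("pricing")
```

Because the batch view is always rebuilt from the full log, an error in the speed layer is corrected on the next batch run. That is the usual argument for accepting the cost of maintaining two processing paths.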