Data Analytics with GeoIdentity


Data analytics lets organizations turn their data assets into insights that deliver business value and help tackle complex problems. At GeoIdentity, we help businesses develop turnkey solutions that turn large, disorganized datasets and systems into effective decision-making platforms.

Developing a decision-making platform requires close engagement with the client on a survey of data assets, the formulation of requirements, implementation architectures, and organization-wide success criteria.

This case study describes how GeoIdentity helped a large public utility company implement a data-driven business analytics system.

Goal

The goal was to implement a data warehouse and analytics program that supports the business in its strategic, tactical, and operational activities, and to transition the utility to a data-driven organization.

Expectations 

The stakeholder groups and their expectations of the data warehouse were:

  • C-level executives
    • Won’t work directly with the data warehouse, but will make strategic decisions based on information and reports derived from it.
  • Senior Managers and Directors
    • Will provide a focused, tactical perspective on data usage to support business-unit-level activities.
  • Department-level power users, Analysts and Managers
    • Will have an in-depth understanding of data manipulation and analysis, and will prepare insights using the data warehouse.

Requirements 

Business Requirements 

  • Reduce non-transactional interactions with OLTP (production/transactional) systems for analytical, non-operational integration and reporting.
  • Establish and manage a master data repository.
  • Add support for real-time analytics and for streaming, geospatial, and similar data.
  • Meet additional data warehousing needs, such as application log storage and storage for multiple streaming datasets.
  • Reduce the burden on IT staff of managing, maintaining, and operating systems.

Technical Requirements 

  • Encryption and security of data, meeting FedRAMP, ISO/IEC 27001, and NIST SP 800-53 requirements.
  • Access control, granular permissions, and compatibility with SSO access at SA.
  • Minimal maintenance load on IT staff – a SaaS-based solution is preferred.
  • Low total cost over a 5-year period (setup and usage), with transparent pricing and cost management options.
  • Support for Dell Boomi iPaaS and Tableau reporting.

GeoIdentity’s Approach

We thoroughly examined all the information our stakeholders provided and developed an approach tailored to the project.

To bridge the gap between the mass accumulation of data in source systems and a comprehensive view of information, we used a data warehouse (and its associated architecture) to create shared insight and institutional knowledge, as highlighted in the following diagram.

Data Analytics, GeoIdentity.com

 

We transitioned the client from a vertically scaled RDBMS to cloud-based horizontal scaling, which lets the data warehousing environment provision additional servers as needed and split workloads across all servers to balance requests. This allowed rapid scaling at lower cost than scaling up any single resource in their on-premise OLTP and warehouse systems.
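The idea of splitting workloads across a growing set of servers can be sketched in a few lines of Python. This is a minimal illustration only, not the system delivered in the project: the `WorkerPool` class, the round-robin policy, and the node names are all assumptions made for the example.

```python
from collections import Counter
from itertools import count


class WorkerPool:
    """Hypothetical pool that spreads requests across servers round-robin."""

    def __init__(self, servers):
        self.servers = list(servers)
        self._next = count()  # monotonically increasing request counter

    def add_server(self, name):
        # Scaling out: provision one more node instead of enlarging an existing one.
        self.servers.append(name)

    def route(self, request_id):
        # Pick the next server in rotation; request_id is returned for tracing.
        server = self.servers[next(self._next) % len(self.servers)]
        return server, request_id


def distribute(pool, n_requests):
    """Route n_requests through the pool and count how many each server received."""
    return Counter(pool.route(i)[0] for i in range(n_requests))
```

Calling `add_server` between two runs of `distribute` shows the per-server share of requests dropping as the pool grows, which is the effect horizontal scaling aims for. A production load balancer would typically also account for server health and current load.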

Scalability –  We implemented low-cost, horizontally scalable commodity hardware that can be managed at scale.

Data models –  We stored objects together by family, which allowed rapid access and made them atomic units for updates. Sharding (splitting data across separate nodes based on ID fields and similar keys) lets data objects reside on multiple clusters of machines, each of which owns its components of the data objects.
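As a rough sketch of ID-based sharding, the Python below assigns each record to a shard by hashing an ID field, so that all records sharing an ID land on the same node. The function names, the `customer_id` key, and the choice of SHA-256 are assumptions for illustration; the source does not say which sharding scheme or key was used.

```python
import hashlib
from collections import defaultdict


def shard_for(record_id: str, n_shards: int) -> int:
    """Map an ID to a shard index using a stable hash.

    Python's built-in hash() is salted per process for strings,
    so a cryptographic digest is used to keep placement repeatable.
    """
    digest = hashlib.sha256(record_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_shards


def place_records(records, n_shards, key="customer_id"):
    """Group records by shard; records with the same key value stay together."""
    shards = defaultdict(list)
    for record in records:
        shards[shard_for(record[key], n_shards)].append(record)
    return dict(shards)
```

Simple modulo placement like this moves many records when the shard count changes. Schemes such as consistent hashing reduce that movement, at the cost of extra complexity.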

 

The data models were redesigned to leverage columnar storage options, as highlighted in the sample illustration below.

Data Analytics, GeoIdentity
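To make the row-versus-column distinction concrete, here is a small Python sketch that pivots row-oriented records into a column-oriented layout and then aggregates a single column. The field names (`meter_id`, `region`, `kwh`) are invented for the example and the code assumes every row has the same fields. Real columnar warehouses add compression and on-disk encoding that this sketch omits.

```python
def to_columns(rows):
    """Pivot a list of row dicts into a dict of column lists."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def column_total(columns, name):
    """Aggregate one column without touching any of the others."""
    return sum(columns[name])


# Hypothetical meter readings stored row by row, as an OLTP system might hold them.
rows = [
    {"meter_id": "M-1", "region": "north", "kwh": 12},
    {"meter_id": "M-2", "region": "south", "kwh": 7},
    {"meter_id": "M-3", "region": "north", "kwh": 21},
]

columns = to_columns(rows)
total_kwh = column_total(columns, "kwh")
```

An analytical query such as a total over `kwh` reads only that one list in the columnar layout, whereas the row layout requires visiting every full record.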

Horizontal scaling provisions additional servers to meet demand and splits workloads across all servers to balance requests, rather than scaling up any single resource.
