UvA's Sakai Data warehouse architecture

3 March 2006, Victor Maijer

 

Introduction

The concept of a data warehouse is relatively new in the world of virtual learning environments (VLEs). It is a common concept in other sectors, where data warehouses have proven valuable for generating strategic (business) information. A data warehouse in this context is defined as a computing environment where users can find strategic information: an environment that puts users directly in touch with the data they need to make better decisions.

This document describes how the University of Amsterdam (UvA) has set up its data warehouse architecture for Sakai.

Architecture

The data warehouse architecture has three components (see figure):

  1. Data acquisition
  2. Data storage
  3. Information delivery

 

 

Data acquisition

This component consists of two major building blocks:

  1. Source Data: the raw data that Sakai produces. An example is the 'event' table in Sakai, which logs user events. Another important source is the session table.
  2. Data Staging: pre-processing of the data. It takes the source data as input and produces preprocessed data as output. An example of output is a record of the number of visitors to a certain course in a certain period.
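
The staging example above (visitors per course in a period) can be illustrated with a minimal sketch. The row layout, the identifiers, and the function name `count_visitors` are assumptions made for illustration; the actual Sakai event and session tables have a different layout, and UvA's staging process is not described here.

```python
from collections import defaultdict
from datetime import date

# Hypothetical, simplified source rows: (user_id, course_id, event_date).
# The real Sakai event and session tables are laid out differently.
events = [
    ("u1", "course-A", date(2006, 2, 1)),
    ("u2", "course-A", date(2006, 2, 3)),
    ("u1", "course-A", date(2006, 2, 20)),
    ("u3", "course-B", date(2006, 2, 5)),
    ("u1", "course-A", date(2006, 3, 1)),
]

def count_visitors(rows, start, end):
    """Return {course_id: number of distinct users} for start <= date <= end."""
    visitors = defaultdict(set)
    for user_id, course_id, event_date in rows:
        if start <= event_date <= end:
            visitors[course_id].add(user_id)
    return {course: len(users) for course, users in visitors.items()}

# Staged output for February 2006: one count per course.
staged = count_visitors(events, date(2006, 2, 1), date(2006, 2, 28))
print(staged)  # {'course-A': 2, 'course-B': 1}
```

Counting distinct users rather than raw events is a design choice of this sketch; a staging step could equally count sessions or events.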

Data Storage

  1. Data warehouse: A collection of transformed and integrated data, stored for the purpose of providing strategic information for the entire organization.
  2. Data marts: A collection of transformed and integrated data, stored for the purpose of providing strategic information for a specific set of users. A college could have its own data mart.
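
One way to picture the relation between the two is to treat a data mart as the subset of warehouse records that concerns one group of users. The sketch below assumes hypothetical records with a `college` field; the field names and the function `build_data_mart` are illustrative and do not come from the UvA implementation.

```python
# Hypothetical warehouse records: one staged record per course and period.
warehouse = [
    {"college": "Science", "course": "course-A", "period": "2006-02", "visitors": 2},
    {"college": "Law", "course": "course-B", "period": "2006-02", "visitors": 1},
    {"college": "Science", "course": "course-C", "period": "2006-02", "visitors": 5},
]

def build_data_mart(rows, college):
    """Return only the warehouse records that belong to one college."""
    return [row for row in rows if row["college"] == college]

science_mart = build_data_mart(warehouse, "Science")
for row in science_mart:
    print(row["course"], row["visitors"])
```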

Information Delivery
Several information delivery mechanisms can be provided in the Sakai data warehouse. The two that will be used are:

  1. Report/Query: Query initiation, formulation, and presentation of results are provided to the user. A reporting environment could be a set of preformatted reports.
  2. Data mining: A method to search for previously unknown relations in the data.
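
A preformatted report, as mentioned under Report/Query, could be as simple as a fixed-layout listing produced from stored records. The following is a minimal sketch under that assumption; the record fields and the function `visitors_report` are hypothetical and do not describe the reporting tool UvA will develop.

```python
# Hypothetical stored records; field names are illustrative only.
records = [
    {"course": "course-B", "period": "2006-02", "visitors": 1},
    {"course": "course-A", "period": "2006-02", "visitors": 2},
    {"course": "course-A", "period": "2006-03", "visitors": 1},
]

def visitors_report(rows, period):
    """Return a fixed-layout text report of visitors per course for one period."""
    lines = [f"Visitors per course, period {period}"]
    selected = sorted(
        (row for row in rows if row["period"] == period),
        key=lambda row: row["course"],
    )
    for row in selected:
        # Course name left-aligned, visitor count right-aligned.
        lines.append(f"{row['course']:<12}{row['visitors']:>6}")
    return "\n".join(lines)

report = visitors_report(records, "2006-02")
print(report)
```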

 

Implication for Sakai infrastructure

The component approach of this architecture makes it possible to implement it in several ways. Different tools can be used for 'Information delivery'. A reporting tool in Sakai (which has yet to be developed) could be used, but other (vendor) tools such as Crystal Reports, or SPSS for analysis, could also be used in this setup.
By default, every component would be implemented in the existing Sakai infrastructure. UvA has decided instead to implement the data warehouse in a separate infrastructure.

The “Source Data” from the Data acquisition component and a “report/query” tool will be within the Sakai application. A reporting tool within Sakai will be developed by the UvA.
The other components will be placed in a separate infrastructure.