The first thing to understand about this code is the motivation behind it. For a long time (maybe since the beginning), Sakai has shipped with a feature in the Site Info tool called "Import from file." The idea is that the maintainer of the site will upload a specially-formatted zip file that contains an archive of all her course materials. The tool works like a wizard, with four screens:
- Asks for a file to upload.
- Shows you a list of categories of content in your archive and allows you to choose the items that should be pulled in.
- Displays a confirmation screen with a summary of what is about to happen and a "Finish" button.
- Upon completion, it displays an "Import Complete" message and an "Ok" button.
The problem with this feature from the get-go was that the only archive files this tool understood were those produced by Sakai's archival functionality and Indiana University's legacy Oncourse system. The initial motivation then was to find a way to substitute other archive file formats for this one, namely Blackboard 5.5 archives in use at Texas State University.
All the code for reading and parsing an archive file was in the Site Info tool code itself, in a 12,000+ line file called SiteAction.java. The new design needed to factor the archive parsing and processing out into an independent module and provide the flexibility to support new archive types without breaking the existing support for Indiana's files.
UI Drives the Design
For better or for worse, the user interface of this feature dictated some major requirements: the fact that it gives you the opportunity to choose categories of content meant that any new archive parser must be able to provide something that satisfies the concept of content categories. Also, it implies that there must be a parsing step preceding the final import step.
The Archive Module
The import code lives in the archive module. The existing archive code includes the ArchiveService, which has been used for importing and exporting in Sakai, including handling the archives in "Import from file."
The long-range plan is for Sakai to support the IMS Common Cartridge specification natively for both importing and exporting. The archive module will gradually acquire and consolidate all these capabilities, but for now it contains two distinct pieces: the original archive service, and the new import architecture. This document exclusively deals with the latter portion of this module.
The import side of things consists of five directories, including import-pack. We will discuss each of these in turn, but first let us give a high-level overview of the approach to archive importing.
The Approach and Key Terminology
The key architectural feature of the import code is that it divides the problem into two parts. Reading an archive file and extracting its pieces of content is done by exactly one parser. Taking those content pieces and stuffing them into Sakai tools is done by multiple handlers, one handler for each Sakai tool.
A parser is a class that understands how to read an archive file and extract from it a vendor-neutral collection of content objects.
A handler takes a vendor-neutral content object and stuffs it into a particular Sakai tool. A handler belongs to one and only one Sakai tool. A handler may be called upon many times in the process of importing a single archive.
A parser is written to accommodate all the specific features of a particular archive format. For example, there is a parser for Blackboard 5.5 archives, and another parser for Blackboard 6.0 archives, because the formats are different. There is yet another parser for IMS Common Cartridge 1.0. Note that only the Common Cartridge and the original Sakai format parsers are included in the Sakai release; other parsers are available in contrib/migration/import.
The key to decoupling the parsing from the handling is that the parser must produce a collection of vendor-neutral content objects. This means that by the time they get to the handler, they are generic, and don't have any vendor-specific formatting. This ensures that the handlers don't have to have any code in them to deal with the particulars of any single format, which in turn means that the same handlers can be used again and again for any archive format, present or future.
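The decoupling described above can be illustrated with a minimal sketch. Every name in it (NeutralItem, FormatParser, ToolHandler, the two toy text formats) is a hypothetical stand-in invented for illustration and not the actual Sakai API. It only aims to show that a single handler can consume the output of two unrelated parsers without knowing which format the content came from.

```java
import java.util.ArrayList;
import java.util.List;

/** Vendor-neutral content object (hypothetical stand-in for an importable). */
final class NeutralItem {
    final String typeName;
    final String title;
    NeutralItem(String typeName, String title) {
        this.typeName = typeName;
        this.title = title;
    }
}

/** One parser per archive format; its output is always vendor-neutral. */
interface FormatParser {
    List<NeutralItem> parse(String archiveText);
}

/** One handler per tool; it never sees vendor-specific markup. */
interface ToolHandler {
    boolean canHandle(String typeName);
    void handle(NeutralItem item, String siteId);
}

/** Toy format A (assumption): one "type|title" entry per line. */
final class PipeFormatParser implements FormatParser {
    public List<NeutralItem> parse(String archiveText) {
        List<NeutralItem> items = new ArrayList<>();
        for (String line : archiveText.split("\n")) {
            String[] parts = line.split("\\|", 2);
            if (parts.length == 2) {
                items.add(new NeutralItem(parts[0].trim(), parts[1].trim()));
            }
        }
        return items;
    }
}

/** Toy format B (assumption): one "title=...;kind=..." entry per line. */
final class KeyValueFormatParser implements FormatParser {
    public List<NeutralItem> parse(String archiveText) {
        List<NeutralItem> items = new ArrayList<>();
        for (String line : archiveText.split("\n")) {
            String title = null;
            String kind = null;
            for (String pair : line.split(";")) {
                String[] kv = pair.split("=", 2);
                if (kv.length != 2) continue;
                if (kv[0].trim().equals("title")) title = kv[1].trim();
                if (kv[0].trim().equals("kind")) kind = kv[1].trim();
            }
            if (title != null && kind != null) items.add(new NeutralItem(kind, title));
        }
        return items;
    }
}

/** Stands in for a handler that would add announcements to a site. */
final class AnnouncementHandler implements ToolHandler {
    final List<String> added = new ArrayList<>();
    public boolean canHandle(String typeName) { return "announcement".equals(typeName); }
    public void handle(NeutralItem item, String siteId) { added.add(siteId + ":" + item.title); }
}

class DecouplingSketch {
    public static void main(String[] args) {
        AnnouncementHandler handler = new AnnouncementHandler();
        FormatParser[] parsers = { new PipeFormatParser(), new KeyValueFormatParser() };
        String[] archives = { "announcement|Welcome", "title=Exam moved;kind=announcement" };
        // The same handler processes items from both formats.
        for (int i = 0; i < parsers.length; i++) {
            for (NeutralItem item : parsers[i].parse(archives[i])) {
                if (handler.canHandle(item.typeName)) handler.handle(item, "site-1");
            }
        }
        System.out.println(handler.added);
    }
}
```

Supporting a third format in this sketch would mean writing one more FormatParser; AnnouncementHandler would not change.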
These generic content objects still have types which identify what kind of content they are; Announcement is one example of such a type. Each content class must implement an interface called Importable. As a group, these content objects are referred to as importables. An object called an ImportDataSource acts as a container for the group of importables. We can ask an ImportDataSource for the categories for that archive, and ask it for the Importables that match a given list of categories.
An importable is a content object that implements the interface org.sakaiproject.importer.api.Importable. Importables are extracted as a collection by a parser, and passed one at a time to one or more handlers to be added to Sakai.
ImportDataSource is an object that acts as a container for the Importable objects in an archive. You can think of it as the abstract representation of some archive.
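To make the container idea concrete, here is a small sketch of the two questions a data source answers: which categories the archive contains, and which items belong to the chosen categories. The class and method names (SketchImportable, SketchDataSource, categories, itemsFor) are assumptions made for this example; the real Importable and ImportDataSource interfaces may look quite different.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical simplified importable: just a category and a title. */
final class SketchImportable {
    final String category;
    final String title;
    SketchImportable(String category, String title) {
        this.category = category;
        this.title = title;
    }
}

/** Hypothetical container playing the role of an ImportDataSource. */
final class SketchDataSource {
    private final List<SketchImportable> items;

    SketchDataSource(List<SketchImportable> items) {
        this.items = new ArrayList<>(items);
    }

    /** Distinct categories, in the order they first appear in the archive. */
    List<String> categories() {
        Set<String> seen = new LinkedHashSet<>();
        for (SketchImportable item : items) {
            seen.add(item.category);
        }
        return new ArrayList<>(seen);
    }

    /** Only the items whose category is in the chosen list. */
    List<SketchImportable> itemsFor(List<String> chosenCategories) {
        List<SketchImportable> matches = new ArrayList<>();
        for (SketchImportable item : items) {
            if (chosenCategories.contains(item.category)) {
                matches.add(item);
            }
        }
        return matches;
    }
}

class DataSourceSketch {
    public static void main(String[] args) {
        SketchDataSource source = new SketchDataSource(Arrays.asList(
                new SketchImportable("Announcements", "Welcome"),
                new SketchImportable("Assessments", "Quiz 1"),
                new SketchImportable("Announcements", "Exam moved")));

        // What the category-selection screen of the wizard would list.
        System.out.println(source.categories());
        // What would be imported if the user ticked only "Announcements".
        for (SketchImportable item : source.itemsFor(Arrays.asList("Announcements"))) {
            System.out.println(item.title);
        }
    }
}
```

The two methods correspond to the two wizard screens that need data: the category list is shown to the user, and the filtered items are what would be passed on for import.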
Sitting between the parsers and the handlers is the ImportService. The default implementation of the ImportService, BasicImportService, keeps a list of available parsers and a list of available handlers, both of which are Spring-injected and configurable in a components.xml file. Here is how a hypothetical client of the ImportService might use it:
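The following is a minimal, self-contained sketch of such a client, written under stated assumptions. The types Content, ArchiveParser, ContentHandler, and SketchImportService, along with their method names and the toy "TOY" archive format, are hypothetical and exist only for this example; they are not the signatures of the real ImportService. The constructor arguments stand in for the lists that Spring would inject.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

/** Hypothetical vendor-neutral item: a type name plus a title. */
final class Content {
    final String type;
    final String title;
    Content(String type, String title) { this.type = type; this.title = title; }
}

interface ArchiveParser {
    boolean isValidArchive(byte[] data);
    List<Content> parse(byte[] data);
}

interface ContentHandler {
    boolean canHandleType(String type);
    void handle(Content item, String siteId);
}

/** Holds the configured parsers and handlers (Spring would inject these lists). */
final class SketchImportService {
    private final List<ArchiveParser> parsers;
    private final List<ContentHandler> handlers;

    SketchImportService(List<ArchiveParser> parsers, List<ContentHandler> handlers) {
        this.parsers = parsers;
        this.handlers = handlers;
    }

    /** Step 1: the first parser that recognizes the archive parses it. */
    List<Content> parseFromFile(byte[] data) {
        for (ArchiveParser parser : parsers) {
            if (parser.isValidArchive(data)) return parser.parse(data);
        }
        throw new IllegalArgumentException("No configured parser recognizes this archive");
    }

    /** Step 2: every handler that accepts an item's type receives it. */
    void doImportItems(Collection<Content> items, String siteId) {
        for (Content item : items) {
            for (ContentHandler handler : handlers) {
                if (handler.canHandleType(item.type)) handler.handle(item, siteId);
            }
        }
    }
}

class ImportClientSketch {
    /** Toy parser: archives start with "TOY\n", then one "type|title" per line. */
    static final ArchiveParser TOY_PARSER = new ArchiveParser() {
        public boolean isValidArchive(byte[] data) {
            return new String(data, StandardCharsets.UTF_8).startsWith("TOY\n");
        }
        public List<Content> parse(byte[] data) {
            List<Content> items = new ArrayList<>();
            String[] lines = new String(data, StandardCharsets.UTF_8).split("\n");
            for (int i = 1; i < lines.length; i++) {
                String[] parts = lines[i].split("\\|", 2);
                if (parts.length == 2) items.add(new Content(parts[0], parts[1]));
            }
            return items;
        }
    };

    /** Imports only the items whose type the user selected; returns what was handled. */
    static List<String> run(byte[] upload, List<String> selectedTypes, String siteId) {
        final List<String> log = new ArrayList<>();
        ContentHandler announcements = new ContentHandler() {
            public boolean canHandleType(String type) { return type.equals("announcement"); }
            public void handle(Content item, String site) { log.add(site + ":" + item.title); }
        };
        SketchImportService service = new SketchImportService(
                Arrays.asList(TOY_PARSER), Arrays.asList(announcements));

        List<Content> parsed = service.parseFromFile(upload);   // after the upload screen
        List<Content> chosen = new ArrayList<>();
        for (Content item : parsed) {                            // the selection screen
            if (selectedTypes.contains(item.type)) chosen.add(item);
        }
        service.doImportItems(chosen, siteId);                   // after "Finish"
        return log;
    }

    public static void main(String[] args) {
        byte[] upload = "TOY\nannouncement|Welcome\nquiz|Week 1".getBytes(StandardCharsets.UTF_8);
        System.out.println(run(upload, Arrays.asList("announcement", "quiz"), "site-1"));
    }
}
```

In this sketch the quiz item is parsed and selected but silently skipped at import time, because no handler in the configured list claims its type. Whether the real service behaves the same way for unhandled types is not something this document states.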
The import process has two steps:
- parse an archive file and get back an ImportDataSource
- pass a collection of importables to the handlers, which then push content into Sakai
import-api project. Here's what the files look like exploded out:
These instructions assume you are using the code for Sakai 2.3.
If you want to configure the import code with Sakai 2.2.x, see 2.2.x Import Instructions
Sakai does not come with Blackboard, WebCT, or any other commercial parsers installed. Commercial parsers live in contrib, and must be installed separately. Copy the parser folders you want from https://source.sakaiproject.org/contrib/migration/trunk/import-parsers into your Sakai source code in the /archive/import-parsers directory. You must then configure, build, and deploy the archive module with the new parser(s). You must add dependencies for the additional parsers to the project.xml file in /archive/import-pack.
Edit the import-pack/src/webapps/WEB-INF/components.xml file
By default, the only parser that is configured with Sakai is the original Sakai archive parser. The Common Cartridge parser is commented out. You can add Common Cartridge support to import by uncommenting the Common Cartridge bean.
You also have to add your chosen parsers as properties of the ImportService bean. Here again, the Sakai archive parser is enabled by default, and the Common Cartridge parser must be uncommented:
A handler is code that works on behalf of a Sakai tool and contributes import content to that tool. You must uncomment the handlers that you want available to the import. The original Sakai archive format doesn't use any handlers, because it still uses the ArchiveService to push things into Sakai. The handlers are used by the Common Cartridge parser as well as any other parsers you configure.
And just like the parsers, the handlers must be set as properties of the ImportService bean:
There are two dependencies in import-pack/project.xml that need to be uncommented if you decide to use the Samigo handler. Again, the handlers don't do anything for the original Sakai archive format, so don't bother uncommenting the dependencies for the Samigo handler unless you are using the Common Cartridge parser or one of the other parsers available at https://source.sakaiproject.org/contrib/migration/trunk/import-parsers.