SOURCE – Sharing Objects Under Repository Control with Everyone
Presenter: David Flanders, Birkbeck
Feb-12-08, Birkbeck, London
SOURCE: A (repository) Bulk Migration Service
Common Repository Interface Group (CRIG)
NB: BBK now part of the ‘Bloomsbury Colleges’ at U of London
Opening question: what happens when we have an open content marketplace?
(US context: inquiry into options for replacing textbooks for students)
Idea that content containers have objects within them – we should be in a world where we can place these where we choose
Approached various systems/vendors and asked how the SOURCE team could enable push/pull
- Generic use case: the problem with APIs is that they are usually programming-language specific. The team did not want to create the burden of having to maintain those APIs
- Borrowing from OpenContent at MIT – wrapper around API to enable interoperability (cf. developer debate on ‘Jersey style’ vs ‘MIT style’)
- Migration service sits behind the wrapper – ideally, the team wants to plug it into other services (metadata transformation, object transformation, etc.)
Repository or other content-based store made up of ‘search, get, put’
The SOURCE world view is that metadata and asset are parts of an object (both are datastreams)
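A minimal sketch of the 'search, get, put' view, to make the idea concrete. It is not the SOURCE team's code: the names DigitalObject, Repository, InMemoryRepository and migrate are hypothetical, and the in-memory store stands in for a real repository or VLE. It assumes an object is simply a set of named datastreams, with metadata and asset treated alike.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Protocol


@dataclass
class DigitalObject:
    """Hypothetical object model: metadata and asset are both just datastreams."""
    identifier: str
    datastreams: Dict[str, bytes] = field(default_factory=dict)


class Repository(Protocol):
    """Assumed minimal contract for any content store: search, get, put."""

    def search(self, query: str) -> List[str]: ...
    def get(self, identifier: str) -> DigitalObject: ...
    def put(self, obj: DigitalObject) -> None: ...


class InMemoryRepository:
    """Toy store used only for illustration; a real one would wrap a vendor API."""

    def __init__(self) -> None:
        self._objects: Dict[str, DigitalObject] = {}

    def search(self, query: str) -> List[str]:
        # Naive substring match on identifiers, returned in sorted order.
        return sorted(i for i in self._objects if query in i)

    def get(self, identifier: str) -> DigitalObject:
        return self._objects[identifier]

    def put(self, obj: DigitalObject) -> None:
        self._objects[obj.identifier] = obj


def migrate(source: Repository, targets: List[Repository], query: str) -> int:
    """Pull every matching object from source and push it to each target (1 to N)."""
    moved = 0
    for identifier in source.search(query):
        obj = source.get(identifier)
        for target in targets:
            target.put(obj)
        moved += 1
    return moved
```

Because migrate only relies on the three operations, any store that offers them could in principle sit at either end, which is the point of keeping the interface this small.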
First iteration: Alpha client tool (built in Java Swing)
Interface built for repository managers
Looking for use cases as to what types of migration people might want.
Can create ‘profiles’ to reflect these and can save them
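A saved profile could be as simple as a small serialisable record. The sketch below is an assumption about what such a profile might hold; the field names (source, targets, query, transformations) and the JSON format are hypothetical and not taken from the SOURCE tool.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class MigrationProfile:
    """Hypothetical description of one reusable migration set-up."""
    name: str
    source: str                       # label or address of the source store
    targets: List[str]                # one or more destination stores
    query: str = ""                   # which objects to select
    transformations: List[str] = field(default_factory=list)


def profile_to_json(profile: MigrationProfile) -> str:
    """Serialise a profile so it can be saved and reused later."""
    return json.dumps(asdict(profile), indent=2, sort_keys=True)


def profile_from_json(text: str) -> MigrationProfile:
    """Rebuild a profile from its saved JSON form."""
    return MigrationProfile(**json.loads(text))
```

A repository manager could then keep one profile per recurring job (for example, a VLE to archive migration with a metadata transformation step) and reload it rather than reconfiguring the tool each time.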
(check out OKI’s ‘wrapper search’)
The team wants someone else to create a ‘crosswalk service’ that SOURCE can use.
- Asset transformation – e.g. taking movie file and pushing it into a Flash file
- Being able to move between IMS CPs and Atom
Second iteration: Beta build (current build).
Team threw away most of Java Swing client – they are now using Ruby
9 months left. Will do user testing.
Sample workflow:
Lecturer uploads a digital object (PDF etc.) to the VLE (for academics, this is a familiar workflow!). The tool then comes into play, migrating the data to an archive repository (not an institutional repository). From there, workflow processes can be invoked, potentially involving an institutional librarian (e.g. adding metadata to valuable content; discarding some content; stripping images from PPTs and sending them to Flickr, etc.).
Real world testing:
- IR able to act as primary deposit with guarantee to deposit in multiple places
- Whole migration from one system to another
- Migration with metadata transformation service being invoked (e.g. LOM to DC)
- Migration with object transformation invoked (e.g. IMS CP to APP)
- Migration with datastream transformation (e.g. .mov to .fla)
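The metadata transformation case can be sketched as a crosswalk applied during migration. This is illustrative only: the flat key names such as "general.title" are a simplified, hypothetical stand-in for LOM's XML structure, the mapping table covers only a handful of elements, and the function names are not from the SOURCE project. In the design described in the talk, this step would be an external service that the migration tool invokes.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]
Transform = Callable[[Record], Record]

# Partial, illustrative LOM to Dublin Core mapping (simplified flat keys).
LOM_TO_DC: Dict[str, str] = {
    "general.title": "title",
    "general.description": "description",
    "general.language": "language",
    "general.keyword": "subject",
    "technical.format": "format",
    "rights.description": "rights",
}


def lom_to_dc(lom: Record) -> Record:
    """Map known LOM fields to DC elements; unmapped fields are dropped."""
    return {LOM_TO_DC[key]: value for key, value in lom.items() if key in LOM_TO_DC}


def apply_transforms(record: Record, transforms: List[Transform]) -> Record:
    """Run a metadata record through each transformation in order."""
    for transform in transforms:
        record = transform(record)
    return record
```

Dropping unmapped fields is a design choice made here for brevity; a real crosswalk service would more likely report or preserve them, since silent loss of metadata is one of the migration issues raised in discussion.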
Discussion
100s of migration issues: which ones are the key ones…?
Scott Wilson suggests a potential (plausible, but sensitive) use case scenario: VLE migration (i.e. end-of-licence, managing data export)
Audience question: Has the project looked at national research councils, which have mandated data deposit? There are dataset archiving issues as well. A: yes, the project has looked at “1 to N” migration scenarios and preservation issues (e.g. those associated with the British Library’s “e-theses” project).