Scientific Annotations Middleware Notebook


SAM Requirements by Jim Myers - 28 Feb 2002 18:32:11 GMT
Description: Send comments to Jim.Myers@pnl.gov

SAM Cross-Cutting Requirements

  1. Security

    1. Allow SAM to adopt the security implementations used in a community or PSE.
      1. Address issues of authentication, authorization, encryption, and non-repudiation within SAM to the extent that such issues are addressed within the chosen security implementation.
      2. Implement security capabilities in SAM through standard component-and-service interfaces that hide details of security processes, such as acquiring the user's credentials for authentication and signing, and the underlying security service implementation.
    2. Implement simple username/password authentication.
    3. Work with partners to support stronger authentication via the Grid Security Infrastructure (GSI).
    4. Investigate: Provide mechanisms to address authorization at the level of individual properties.
    5. Investigate: Provide mechanisms to avoid indirect exposure of restricted metadata, e.g. by queries that do not directly return restricted metadata but still expose its existence by inference through the set of objects returned.
    6. Investigate: Document the level of authorization complexity that must be supported in real scientific communities through partnerships with security middleware providers and pilot users.
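
One way to read requirements 1.2 and 2 together is a small authentication interface that callers use without knowing which security implementation sits behind it. The sketch below is illustrative only: the names `Authenticator` and `PasswordAuthenticator`, and the choice of salted PBKDF2 hashing, are assumptions and do not come from the SAM proposal. A GSI-backed class could implement the same interface later.

```python
import hashlib
import hmac
import os
from abc import ABC, abstractmethod


class Authenticator(ABC):
    """Hypothetical interface: callers never see how credentials are checked."""

    @abstractmethod
    def authenticate(self, username, secret):
        """Return True if the credentials are accepted."""


class PasswordAuthenticator(Authenticator):
    """Simple username/password check (assumed storage: salted PBKDF2 hashes)."""

    def __init__(self):
        self._users = {}  # username -> (salt, password hash)

    @staticmethod
    def _hash(password, salt):
        return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)

    def add_user(self, username, password):
        salt = os.urandom(16)
        self._users[username] = (salt, self._hash(password, salt))

    def authenticate(self, username, secret):
        record = self._users.get(username)
        if record is None:
            return False
        salt, expected = record
        # Constant-time comparison avoids leaking how many bytes matched.
        return hmac.compare_digest(expected, self._hash(secret, salt))
```

Code that depends only on `Authenticator` would not need to change if a community swapped in a stronger mechanism.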

  2. Events

    1. Expose SAM events via a standard mechanism (e.g. JMS, Grid Events). The initial set will include:
      • events related to the management (creation, modification, deletion, etc.) of objects and associated metadata (important within PSE environments).
      • events related to user actions (login, queries, etc.) and system configuration (allowing integration with system-level logging and auditing capabilities).
      1. Develop initial implementation using existing commercial or open source event service mechanism.
      2. Encapsulate any implementation dependencies if possible.
    2. Investigate: Determine additional SAM events that should be reported externally.
    3. Make the list of events that SAM publishes configurable.
    4. Investigate: Provide a means for SAM to respond to events from other applications. The proposal gives no example; one possibility is an underlying data store that reports new objects created directly (not via DAV/SAM), for which SAM should generate metadata. Work closely with potential SAM users to define motivating use scenarios and to support the appropriate event types.
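
The configurable event list can be pictured as a publisher that forwards only the event types an administrator has marked public. This is a minimal in-process sketch, not a JMS or Grid Events binding; `EventPublisher` and the event type strings are hypothetical names chosen for illustration.

```python
from collections import defaultdict


class EventPublisher:
    """Hypothetical publisher that exposes only a configured set of event types."""

    def __init__(self, public_events):
        self.public_events = set(public_events)
        self._subscribers = defaultdict(list)

    def subscribe(self, event_type, callback):
        self._subscribers[event_type].append(callback)

    def publish(self, event_type, payload):
        # Events not on the configured list are silently kept internal.
        if event_type not in self.public_events:
            return False
        for callback in self._subscribers[event_type]:
            callback(event_type, payload)
        return True
```

Keeping the filter inside the publisher means the underlying event service could be replaced without changing which events are exposed.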

  3. Architecture

    1. Separate MMS, SS, and NS service configuration/administration interfaces from those used directly for accessing/managing metadata, semantic relationships, and notebooks. (That is, interfaces for configuring MMS translations and metadata generation capabilities will be separate from the DAV access protocol. This, together with default Web configuration pages for SAM services, could make it possible to set up an MMS service and then run non-SAM-aware applications against it, thus gaining the benefits of SAM query translations and metadata generation capabilities without any programming.)
    2. Minimize the functionality an application or agent must implement to use SAM.
    3. Build Web configuration and administration pages as the default mechanism for these services.
    4. Implement these pages such that they can be used on a standalone basis, from within a portal, or used as a basis for programmatic access (e.g. use an underlying web service).

MMS Requirements

  1. Basic Data/Metadata Management

    1. Provide basic capabilities for storing and retrieving arbitrary data and XML/textual metadata.
    2. Provide capabilities for querying metadata to find relevant data sets.
    3. Support queries in multiple or evolving schema.
    4. MMS should meet both user requirements for a ‘stand-alone’ service that enhances DAV and internal requirements for MMS to serve as a foundation for higher layers of SAM.

  2. Datastore Federation

    1. Support a federated view of multiple, independently managed data stores (i.e. support basic storage, retrieval, and metadata queries across multiple data stores as though all data were hosted in a single DAV server).
      1. Provide mechanisms to translate incoming queries, forward them to the underlying data stores, and gather the query results.
    2. Provide a generic programming mechanism to register underlying data stores with the MMS and associate the schema translations necessary to access them. Implementations should be created for file, relational DB, DAV, and dataGrid stores.
    3. Provide XML-configurable implementations for registering relational databases.
    4. Investigate: Mechanism for associating metadata in one data store with an object in another store. (Required by CMCS)
    5. Investigate: Extend federation capabilities to allow complex queries to be made across such metadata layers (e.g. queries that would require joins in a relational system).
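
The translate, forward, and gather steps described in this section can be sketched as follows. Everything here is an assumption made for illustration: `DictStore` stands in for a real file, relational, DAV, or dataGrid store, and a schema translation is reduced to a simple mapping from community property names to store-local names.

```python
class DictStore:
    """Stand-in data store: maps object URL -> metadata in the store's own schema."""

    def __init__(self, records):
        self.records = records

    def query(self, prop, value):
        return [url for url, meta in self.records.items() if meta.get(prop) == value]


class FederatedMMS:
    """Hypothetical federation layer over independently managed stores."""

    def __init__(self):
        self._stores = []  # list of (store, translation) pairs

    def register(self, store, translation):
        """translation maps community property names to store-local names."""
        self._stores.append((store, translation))

    def query(self, prop, value):
        results = []
        for store, translation in self._stores:
            local_prop = translation.get(prop)
            if local_prop is None:
                continue  # this store has no equivalent property
            results.extend(store.query(local_prop, value))  # forward and gather
        return sorted(results)
```

A caller issues one query in the community schema and receives matches from every registered store, without knowing how many stores exist or what they call the property.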

  3. Metadata Translation/Generation

    1. Provide metadata in community-accepted formats (via translations defined in XML, e.g. XSLT) independent of the original storage format.
    2. Provide a mechanism to generate new XML metadata from data (ASCII, binary, XML, ...).
      1. Provide a means to dynamically register metadata generators with the MMS, allowing metadata to be extracted from data objects to supplement metadata reported directly by applications, and, in the limiting case, allowing legacy applications to use file-like semantics to store data while allowing newer applications to take advantage of generated metadata.
    3. Provide a mechanism to register such mappings (translation of metadata, generation of metadata from data) with the MMS and to invoke them during storage, retrieval, and query operations.
    4. Investigate: Provide a mechanism for selecting mappings based on user or community preferences.
    5. Investigate: Provide lightweight mechanisms to define dynamically derived metadata (i.e., properties that can be calculated from existing metadata), e.g. by registering a Java object that will calculate a live property on demand.
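
Dynamic registration of metadata generators might look like the sketch below, in which generators are keyed by content type and run when an object is stored. The names `GeneratorRegistry` and `text_stats` are hypothetical, and the rule that application-reported metadata overrides generated values is an assumption, not something the requirements specify.

```python
class GeneratorRegistry:
    """Hypothetical registry of metadata generators, keyed by content type."""

    def __init__(self):
        self._generators = {}

    def register(self, content_type, generator):
        self._generators.setdefault(content_type, []).append(generator)

    def on_store(self, content_type, data, reported=None):
        """Return generated metadata merged with metadata the application reported."""
        metadata = {}
        for generator in self._generators.get(content_type, []):
            metadata.update(generator(data))
        # Assumption: values reported directly by the application take precedence.
        metadata.update(reported or {})
        return metadata


def text_stats(data):
    """Example generator: derive simple properties from plain-text bytes."""
    text = data.decode("utf-8")
    return {"line_count": len(text.splitlines()), "byte_size": len(data)}
```

A legacy application that only writes files would still end up with the generated properties, since `reported` may be omitted.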

  4. Data Translation

    1. Provide data in community-accepted formats (via translations defined in XML) independent of the original storage format.
    2. Provide a mechanism to register such data mappings with the MMS and to invoke them during storage, retrieval, and query operations.

  5. General

    1. Work with the Collaboratory pilot projects to define mechanisms that will allow the MMS to be extended to support their needs and also allow the higher level services of SAM to make use of the optimized query capabilities of the underlying data stores. Specifically, in discussions of the SAM concept with potential SciDAC pilot Collaboratory projects, scenarios in which local, private metadata undergoes validation and then migration to a community server were postulated. Similarly, a need for special-purpose indexes to enhance performance for common queries has been discussed.
    2. Implement MMS translation capabilities in an extensible manner, such that translations between RDF schema can be accomplished at the Semantic Services Layer (registration, invocation, storage mechanisms for translations should be independent of the details of the translation process).

SS Requirements

  1. Basic Semantic Relationship Management

    1. Provide a common way of representing data pedigree, annotation, and workflow relationships as well as scientific relationships such as the linkages between a gene and related genes in other organisms, protein(s) encoded, physiological effects, etc. (using RDF, RDF Schema, etc.).
    2. Allow relationships to be the objects of other relationships (reification), allowing expression of a researcher's belief about previously identified relationships, i.e. to encode comments about correctness or importance within notebooks or as part of a community review system.
    3. Allow discovery of semantic information and provide a way of working with such information through generic tools (e.g., within Electronic Notebooks).
    4. Provide a discovery mechanism that will allow researchers and their applications to identify the relationships used within a repository.
    5. Store semantic relationships using the underlying MMS, such that the XML encoding the relationship is visible to an MMS-aware client (though the client may not be able to interpret the XML as a relationship).
    6. Allow fine-grained mixing of existing (partial) ontologies, thus allowing cross-disciplinary queries within a single data store or across an MMS-federated store.
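
Requirements 2 and 4 above (relationships about relationships, and discovery of the relationships in use) can be illustrated with a toy statement store. This is a simplification: it gives each statement an identifier that other statements may use as their subject, rather than using the RDF reification vocabulary itself. The class name, identifier scheme, and predicate names are all hypothetical.

```python
import itertools


class RelationshipStore:
    """Toy statement store in which a statement can be the subject of another."""

    def __init__(self):
        self._statements = {}  # statement id -> (subject, predicate, object)
        self._ids = itertools.count(1)

    def add(self, subject, predicate, obj):
        statement_id = f"stmt:{next(self._ids)}"
        self._statements[statement_id] = (subject, predicate, obj)
        return statement_id

    def about(self, target):
        """Statements whose subject is `target` (a resource or a statement id)."""
        return [(sid, s[1], s[2]) for sid, s in self._statements.items() if s[0] == target]

    def predicates(self):
        """Discovery: the relationship names currently used in the store."""
        return sorted({s[1] for s in self._statements.values()})
```

Because `add` returns an identifier, a reviewer's comment on a claimed gene-to-protein link is stored the same way as the link itself.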

  2. Advanced Semantic Relationship Management

    1. Work to extend the MMS query language to directly support relational and ontology-based queries.
    2. Explore mechanisms for registering standard relationship names and ontologies and for providing ontological guidance to researchers and developers.
    3. Investigate: Provide mechanisms to register maps between relationship schemas, allowing for customized query translations and evolutionary standardization of schema.
    4. Investigate: Provide mechanism to automatically discover similarities and conflicts in relationships defined by multiple communities and to dynamically generate ontologies that capture the nature of cross-disciplinary interactions.
    5. Investigate: Opportunities to expand scope based on ongoing developments in the Semantic Web, RDF, OIL, DAML, etc.

  3. General

    1. Provide an architecture that will allow independent development and evolution of metadata types/naming conventions.
    2. Work closely with other SciDAC developers to move toward standard representations for software-generated metadata such as
      • data pedigrees (e.g., experiment parameters, system description, input files, version of software/algorithms used),
      • summary information (e.g., low-resolution subsets, identified features), and
      • relationships to other data (e.g., part of a project or parameter study).
    3. Work with SciDAC end users to define a set of basic, discipline-independent annotation types and semantic relationships that are necessary to represent
      • project plans,
      • hypotheses and conclusions,
      • ideas for follow-on experiments,
      • meeting notes,
      • etc.
Requirements for SAM Notebook Services

Electronic notebooks require a variety of data and metadata services that SAM can provide. These are divided into the following categories: basic services such as pagination and annotation display mechanisms; notebook-required metadata such as digital signatures and time stamps; semantic searching capabilities within and across notebooks; and long-term notebook archiving services required by records management.
  1. Basic Functionality

    1. Implement basic notebook functionality through definition of schema and development of plug-ins for MMS and SS layers.
    2. Provide pagination of scientific annotations and collections into chapters and complete notebooks.
    3. Provide annotation display mechanisms that include fields required by notebooks, such as author, date, and witnessing.
    4. Investigate mechanisms to support customization of notebook page displays based on device and user preferences.
    5. Notebook Services will be demonstrated and integration tested using a notebook client to be created from SAM components.

  2. Integration with Metadata Management

    1. Define notebook metadata schema for author names, time stamps, data types, and other notebook specific metadata.
    2. Provide mechanisms for registration and discovery of components for the creation, editing, time stamping, and display of various data types.
    3. Provide registration for components to be invoked automatically on submission. For example, time stamp service and digital signature generation.

  3. Integration with Semantic Services

    1. Define semantic relationship schema for notebook relationships such as chapter, page, comment on, conclusion about, etc.
    2. Provide capability to query annotations spanning pages, chapters, and notebooks.
    3. Provide search functionality that leverages the semantic services to search explicit and automatically generated metadata associated with notebook annotations.
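
A query that spans pages, chapters, and notebooks amounts to following containment relationships downward and collecting annotations along the way. The sketch below assumes containment and annotations are available as plain dictionaries; in SAM they would presumably come from the Semantic Services layer. The function name and the data layout are illustrative only.

```python
def annotations_under(node, contains, annotations):
    """Collect annotations attached to `node` and to everything it contains.

    contains:    maps a notebook/chapter/page id to the ids it directly contains
    annotations: maps an id to the list of annotations attached to it
    """
    found = list(annotations.get(node, []))
    for child in contains.get(node, []):
        found.extend(annotations_under(child, contains, annotations))
    return found
```

The same call works at any level, so querying a single chapter or a whole notebook differs only in the starting node.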

  4. Records Management

    Records management departments have historically been responsible for dispensing, tracking, collecting, and archiving paper notebooks across a corporation. The lifecycle for a notebook can be 25 years or longer. Electronic notebooks have to meet these same government-mandated requirements, but they have the potential to have many of these functions handled automatically through the following SAM Notebook Services.

    1. Provide mechanisms to track notebooks and subsections of notebooks.
      1. Provide mechanism for tracking notebooks through their lifecycle by creation of tamper-proof sequential serial numbers associated with notebooks.
      2. Provide mechanism for assigning tamper-proof sequential numbering to chapters, pages, etc. (providing the equivalent of a binding and numbered pages on paper)
    2. Investigate ways of customizing for different record-keeping policies (retention schedules, signing, witnessing schedules, etc.).
      1. Implement default policies based on DOE regulations.
      2. Support 21 CFR Part 11 technical requirements, perhaps as options.
    3. Provide mechanism(s) for digital signatures (author, witness, notarize, approve, etc.) and timestamps.
      1. Create time stamp and signature/notarization as separable components.
    4. Provide a tamper-proof log of configuration changes.
    5. Provide a mechanism for long-term archiving of notebooks after a project is finished, including services to migrate data and signatures from one medium to the next while preserving the ability to prove that the contents have not been tampered with.
    6. Investigate exposing specific capabilities to applications in a separable manner. For example, an application that requires a legally defensible audit log could access NS digital signature and tracking services without creating a full notebook interface.
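
    One common technique for sequential numbering and logs that resist silent alteration is a hash chain, in which each entry's digest covers its serial number, its content, and the previous entry's digest. The sketch below is an assumption about how such a requirement could be approached, not the SAM design, and `TamperEvidentLog` is a hypothetical name. On its own it makes tampering detectable rather than impossible; a real system would also need to sign or externally time-stamp the latest digest.

```python
import hashlib
import json


class TamperEvidentLog:
    """Hash-chained, sequentially numbered entries (illustrative only)."""

    GENESIS = "0" * 64  # placeholder "previous digest" for the first entry

    def __init__(self):
        self.entries = []

    @staticmethod
    def _digest(serial, content, previous):
        payload = json.dumps([serial, content, previous]).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()

    def append(self, content):
        serial = len(self.entries) + 1
        previous = self.entries[-1]["digest"] if self.entries else self.GENESIS
        self.entries.append({
            "serial": serial,
            "content": content,
            "previous": previous,
            "digest": self._digest(serial, content, previous),
        })
        return serial

    def verify(self):
        """Recompute the chain; any edit, reorder, or gap makes this return False."""
        previous = self.GENESIS
        for expected_serial, entry in enumerate(self.entries, start=1):
            if entry["serial"] != expected_serial or entry["previous"] != previous:
                return False
            if entry["digest"] != self._digest(entry["serial"], entry["content"], previous):
                return False
            previous = entry["digest"]
        return True
```

    The same structure could serve for page numbering or for the configuration-change log, since both only require ordered, verifiable entries.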


Component/Interface Requirements

  1. General

    1. Components will be developed to simplify the development of applications, portals, problem solving environments, and agents that access SAM services.
      1. Components will be developed that simplify the display of SAM-hosted information and the creation of interactive interfaces to SAM functionality.
      2. Components will be developed that encapsulate, enhance, or simplify the use of SAM programming interfaces and protocols.
    2. Components will be designed for use within Java and Web development environments.
    3. We will work with SciDAC pilot projects to (further) define, scope, and prioritize components for development.
    4. Programming examples detailing the use of SAM components will be developed to guide efforts within other SciDAC pilots to incorporate them into domain applications and PSEs.
    5. The components will support development of a prototype SAM-based notebook.

  2. Component List

    The following list presents a preliminary set of component concepts that have been developed in response to use cases that have arisen during initial discussion with the developers of other SciDAC proposals, and in internal discussions of notebook interface requirements.

    1. Search Tool – Generic tool for the development of queries and the return of search sets that uses schema discovery capabilities to populate the interface with search terms supported by a given SAM implementation.
    2. Metadata Viewer – Given the URL for a data object within SAM, displays a table with the key:value property pairs associated with it.
    3. Graphical Relationship Browser – Displays a visual representation of data objects and relationship arcs allowing users to browse through data based on relationships such as those defined for data pedigrees.
    4. Notebook Explorer / Table of Contents – Given the URL for a notebook, presents the table of contents for a notebook, producing selection events as users click within the display.
    5. Notebook File Save/Open Dialogs – Given the URL for a notebook, uses the table of contents window to present dialogs for submitting and retrieving data objects to/from a notebook, working analogously to File Open and Save dialogs.
    6. Data Viewer – Given the URL of a data object, attempts to render it based on the viewers registered within a SAM instance for that data type.
    7. Notebook Page View – Given the URL for a notebook page, renders the page view.
    8. Metadata Translator Registration Utility – allows registration of (XSLT) translators with a SAM instance.
    9. Data Signature Widget – allows a digital signature to be applied to a given data object and for existing signatures to be verified.
