Copyright 1996 Battelle Memorial Institute. Presented at the IEEE Fifth Workshops on Enabling Technology: Infrastructure for Collaborative Enterprises (WET ICE '96), June 19-21, 1996, Stanford, California, USA. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works, must be obtained from the Battelle Memorial Institute.

The EMSL Collaborative Research Environment (CORE) - Collaboration via the World Wide Web

Deborah A. Payne, James D. Myers [1]

Environmental Molecular Sciences Laboratory, Pacific Northwest National Laboratory, K1-87, P.O. Box 999, Richland, WA 99352, USA debbie.payne@pnl.gov, jim.myers@pnl.gov

Abstract

The Environmental Molecular Sciences Laboratory (EMSL)[1] Collaborative Research Environment (CORE) provides an integrated set of internet-based collaborative capabilities that appear to collaborators as extensions to the World Wide Web (WWW). CORE is designed to support the virtual enterprise composed of the EMSL and collaborating academic, government, and industrial partners, providing cross-platform capabilities for peer-to-peer, mentor-student, and "producer-consumer" collaborations. CORE provides a simple "one click" method to start or join multi-tool collaborative sessions from a WWW page. Through the use of Common Gateway Interface scripts, sockets communications, and custom multipurpose internet mail extension (MIME) file types, the CORE Session Manager / Desktop Executive pair launches and tracks active sessions, participants, and tools, giving users the ability to pick collaborative capabilities appropriate for their work without awareness of the connection syntax of individual tools, port numbers, firewalls, or their collaborators' internet addresses. By providing a single, simple interface to a wide range of collaborative capabilities, CORE helps make group collaborations more natural, informal, and effective.

I. INTRODUCTION

The move toward virtual enterprises, seen in today's business world, is also occurring in the field of scientific research. National laboratories, such as the Pacific Northwest National Laboratory (the Laboratory), are making their data, instruments, and expertise, available to academic, industrial, and government collaborators, and conversely, are planning to make use of physical and intellectual resources at other institutions to supplement their own capabilities. While some sharing of this sort occurs today, the use of electronic collaboration tools promises to greatly enhance both the quantity and quality of such collaborations.

The Laboratory's Environmental Molecular Sciences Laboratory (EMSL)[1] is currently developing and deploying electronic collaboration tools through its Collaborative Research Environment (CORE) project. The EMSL is a new $230M facility for basic research in environmental and molecular sciences in support of the Department of Energy's mission to develop new technologies to clean up the nation's hazardous waste sites. CORE is designed to help the EMSL fulfill its mission as a collaborative user facility. As part of this project, we are studying the sociology of current scientific collaboration and the current scientific process. This effort is guiding the development of the CORE paradigm and tool suite.

II. EMSL COLLABORATIONS

The EMSL will house many unique facilities for basic scientific research, including the world's first commercial gigahertz Nuclear Magnetic Resonance (NMR) spectrometer, a scanning near field optical microscope, and the most powerful IBM parallel supercomputer yet built. Overall, the EMSL will house nearly 300 researchers with unique expertise, equipment, and software. As part of the EMSL Collaboratory efforts[2], researchers were asked about their current and potential future collaborations in order to understand what types of communications an electronic collaborative environment should support.

Collaborations between the EMSL and other institutions will take many forms. Some will involve researchers in the same field sharing an instrument. The remote researcher might contribute to the design of a new detector and then use the instrument to study molecular systems of interest. In this peer-to-peer type of collaboration, the researchers share a common scientific vocabulary. The most important aspects of their collaborations are shared instruments and unanalyzed data, making remote instrument control and direct data file access important.

A second type of collaboration anticipated is between scientists doing complementary studies of the same molecular systems. For instance, a theorist may calculate structures of molecular clusters while an experimentalist uses laser spectroscopy to make an experimental measurement of the structure. Researchers in such collaborations share less of a common vocabulary and must often translate their results into each other's terms, alternating between the roles of mentor and student. Direct access to instruments or to raw data becomes less useful to the researchers, while access to summaries and analyses, perhaps recorded into an electronic notebook, and the ability to discuss unfamiliar concepts and to correct misunderstandings become more important.

A third type of collaboration, again involving researchers in different disciplines, involves one researcher, or research team, providing input for another. Examples of this type of collaboration include a mass spectroscopist determining the sequence of a protein or other biopolymer for a biologist, or a surface scientist providing reaction rate data to a geologist modeling the subsurface transport of hazardous wastes. Working with an analytical laboratory on a fee-per-service basis represents an extreme form of this "producer-consumer" type of collaboration. There is often a wider gap between the disciplines and motivations of researchers in such collaborations; a scientist may be interested in a new physical phenomenon while their collaborator, an engineer, is trying to reduce the cost of a clean up effort. They may have little chance for professional contact in their daily work or at conferences. Researchers in these types of relationships place the strongest emphasis on being able to receive a sample and information about it, and being able to transmit results back to the other party. However, new ideas and approaches can appear if these researchers communicate more closely. The EMSL and the Laboratory hold seminar series, workshops, and pizza dinner discussions to foster this type of communication between basic and applied scientists. This suggests that if these researchers are provided with readily available tools for electronic discussions, their collaboration may become more complementary as the researchers adjust their studies to incorporate new ideas from each other.

It is important to note that while these classifications and examples all relate to scientific research, similar collaborations arise in education and business. Students may ask professors for help while working in teams of peers on projects. Workers might have peer-to-peer collaborations within their organization, and mentor-student or producer-consumer collaborations with suppliers and customers. Thus, software that is designed to support scientific collaborations will be applicable in other domains as well.

III. THE NEED FOR A COLLABORATIVE ENVIRONMENT

During any collaboration, communication naturally switches between media as appropriate. When people are talking together, they may simply frown or shrug to convey certain points. Or they may use prepared graphics to support their discussion, or use a whiteboard for text and drawing. A concept that is very difficult to convey through speech may quickly be made clear with a drawing or a demonstration. Similarly, reference materials or lectures can be used to convey some ideas, while interactive discussions and hands-on work are required to communicate others.

To support collaborations electronically, one must support a variety of communication media, varying degrees of interactivity, and the ability to switch naturally between media. Many tools for communication via computer have been developed over the years. Some, such as the file transfer protocol (FTP) and the World Wide Web (WWW), allow users to access posted material at their leisure. Others, such as NASA's multicast backbone (MBONE) [3] video broadcasts of space shuttle missions, allow many users to passively observe a live event. E-mail and newsgroups allow users to serially interact, while videoconferencing software allows real-time interaction between people. In Fig. 1, we've plotted a sampling of existing tools and ones under development on two axes: the degree of synchronicity and the degree of interactivity of the tools.

While there is certainly overlap in the capabilities of these tools, no one tool supports the full spectrum of capabilities, and each tool provides unique benefits for certain collaborative activities. Activities in scientific collaboration range from discovering research capabilities and receiving training, to performing experiments and analyzing results, to resolving problems during these activities through discussion, and finally preparing results for presentation or publication and beginning the cycle again. Each individual collaboration will include many of these activities, perhaps in different proportions in the different types of collaboration, and therefore will require more capabilities than a single tool can provide.

Figure 1. Categorization of tools for electronic collaboration

The purpose of a collaborative environment, such as CORE, is to provide users with a single simple way to access multiple electronic collaboration capabilities independent of their computer platform. In addition to providing a simple way to select collaborative capabilities, the environment should hide the different syntax each tool has for launching and connecting to collaborators. The environment helps make collaboration natural: computer addresses, port numbers, and firewalls all disappear from the user's view. Indeed, CORE makes connection simpler than a phone call: users start and join sessions using their names and a short topic description.

IV. CORE ARCHITECTURE

CORE is a multi-platform environment implemented using the World Wide Web (WWW) for its main user interface and session management communications. CORE uses stand-alone tools, or combinations of compatible tools, to provide cross-platform capabilities to the user. With the rapid progress being made in collaborative computing, CORE's ability to incorporate the latest third party tools, and to quickly add new tools as they are developed, provides a great benefit to the user.

CORE relies on a central session manager and desktop executives that coordinate communications between participants and configure the various collaborative components. Use of the WWW paradigm makes the system very easy for users to understand. The main interface of CORE is a WWW page that allows users to start or join collaborative sessions via a WWW form. This page, shown in Fig. 2, uses a common gateway interface (CGI) script to process user input. To start a new session, the user enters their name in the "User Name" text box and a brief topic description in the "Session Name" text box, selects the capabilities desired from the list by marking the appropriate checkboxes, and clicks on the "Start a New Session" push button. To join an existing session, the user enters their name and clicks the button showing the desired topic in the "Active Sessions" list.
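
As a rough illustration of the form handling described above, the following Python sketch decodes a submitted form body into a start or join request. The field names (user, session, tool, action) are hypothetical; the paper does not give the names used by the CORE form or its CGI script.

```python
from urllib.parse import parse_qs


def parse_core_form(body):
    """Decode a URL-encoded form submission into a request dict.

    The field names are assumptions made for this sketch, not CORE's own.
    """
    fields = parse_qs(body)  # blank values are dropped by default

    user = fields.get("user", [""])[0].strip()
    if not user:
        raise ValueError("a user name is required")

    topic = fields.get("session", [""])[0].strip()
    if not topic:
        raise ValueError("a session name is required")

    return {
        "action": fields.get("action", ["join"])[0],
        "user": user,
        "topic": topic,
        # Each marked checkbox arrives as a repeated "tool" field.
        "tools": fields.get("tool", []),
    }
```

For example, a body of "action=start&user=Debbie&session=ELN+review&tool=webtour&tool=chat" would yield a start request for the topic "ELN review" with two tools selected.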

Figure 2. The CORE WWW user interface

The CGI script processes the user's input and connects to the CORE session manager via TCP/IP sockets. The session manager is a "C" language server running continuously on a workstation (not necessarily the WWW server workstation where the CGI script runs). The session manager tracks who is in each session and what tools they are using. Information required to connect the CORE desktop executives as well as to connect the selected tools is maintained by the session manager and returned to the CGI script.
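
The bookkeeping role of the session manager can be sketched in a few lines. The actual session manager is a C server reached over sockets; the Python below is only an in-memory illustration, and the class and method names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Session:
    topic: str
    tools: List[str]
    # Maps each participant's name to the tools that participant is running.
    participants: Dict[str, Set[str]] = field(default_factory=dict)


class SessionManager:
    """Tracks who is in each session and what tools they are using."""

    def __init__(self):
        self.sessions: Dict[str, Session] = {}

    def start(self, user, topic, tools):
        if topic in self.sessions:
            raise ValueError("session %r already exists" % topic)
        session = Session(topic=topic, tools=list(tools))
        session.participants[user] = set(tools)
        self.sessions[topic] = session
        return session

    def join(self, user, topic):
        session = self.sessions[topic]  # KeyError if no such session
        session.participants[user] = set(session.tools)
        return session

    def active_topics(self):
        """Topics that would appear in the "Active Sessions" list."""
        return sorted(self.sessions)
```

In this sketch a joining user simply inherits the tool list chosen by the user who started the session, which is one plausible reading of the join behavior described above.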

When a new session is started, the session manager may start server processes for some of the tools, such as the EMSL TeleViewer described below. For other tools, such as videoconferencing, the user's IP address and platform type are used to determine the appropriate parameters for launching the client videoconferencing software. In our environment, we have implemented a CU-SeeMe[4] reflector bridge across the Laboratory's firewall. Macintosh and PC users conference with CU-SeeMe, connecting to the appropriate end of the bridge. UNIX users can connect to the bridge using nv and vat [3]. The session manager determines the appropriate parameters to launch software on each user's machine. The session manager can also limit the total number of sessions, or the number started from certain addresses, e.g. from outside the Laboratory, as desired.
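
The selection logic described above might look roughly like the following Python sketch. The network range, bridge addresses, platform labels, and session limit are placeholders invented for illustration (the addresses come from ranges reserved for documentation); none of them are CORE's actual values.

```python
import ipaddress

# Placeholder values for this sketch only.
INTERNAL_NET = ipaddress.ip_network("192.0.2.0/24")   # stands in for the Laboratory network
BRIDGE_INSIDE = "192.0.2.10"                          # bridge end inside the firewall
BRIDGE_OUTSIDE = "198.51.100.10"                      # bridge end outside the firewall

VIDEO_TOOLS = {
    "macintosh": ["CU-SeeMe"],
    "windows": ["CU-SeeMe"],
    "unix": ["nv", "vat"],
}


def is_internal(user_ip):
    return ipaddress.ip_address(user_ip) in INTERNAL_NET


def video_launch_plan(platform, user_ip):
    """Pick conferencing tools by platform and a bridge end by address."""
    tools = VIDEO_TOOLS[platform.lower()]  # KeyError for unknown platforms
    bridge = BRIDGE_INSIDE if is_internal(user_ip) else BRIDGE_OUTSIDE
    return {"tools": tools, "bridge": bridge}


def may_start_session(user_ip, external_sessions, max_external=5):
    """Cap only the sessions started from outside the internal network."""
    return is_internal(user_ip) or external_sessions < max_external
```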

Once all the connection information is determined, and appropriate servers are started, the CGI script sends a custom MIME file to the user's browser. The CORE desktop executive is started as a viewer (helper application) for this custom MIME file, just as a video player is started to "view" a video/mpeg MIME type movie file. Others have used the MIME mechanism, without using a CGI script, for launching real-time collaborative tools from the WWW, allowing connections to a predefined mentor [5]. Use of a script extends this method to allow dynamic connections between any group of users.
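
To make the helper-application mechanism concrete, the sketch below shows a CGI-style response carrying a custom content type, together with the parsing a helper might do on the received file. The type name "application/x-core-session" and the key=value file layout are assumptions for illustration; the paper does not specify CORE's MIME type or file format.

```python
CORE_MIME_TYPE = "application/x-core-session"  # hypothetical type name


def build_cgi_response(session_info):
    """Return the header and body a CGI script would write to stdout.

    The browser hands the body to whichever helper application the user
    has registered for the content type.
    """
    body = "".join("%s=%s\n" % (key, value) for key, value in session_info.items())
    return "Content-Type: %s\r\n\r\n%s" % (CORE_MIME_TYPE, body)


def parse_session_file(body):
    """Helper-side parsing of the key=value lines (values stay strings)."""
    info = {}
    for line in body.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            info[key] = value
    return info
```

A browser configured with a helper for this type would save the body to a temporary file and start the helper with that file, in the same way it starts a movie player for video/mpeg.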

Figure 3. CORE communications pathways. CORE uses standard mechanisms to talk to the WWW (upper left). The branched arrow at the left signifies platform specific launching mechanisms while single arrows represent current (solid) and future (dashed) connections via sockets.

If a session is being started, the executive opens a socket on an unused port number and begins listening for other executives to connect to it. The executive also connects to the local WWW browser, using one of that browser's standard mechanisms. The executive uses the local browser to report the chosen listening port number to the session manager via a second automatic call to the CGI script. The CGI script responds to this message by updating the "Active Sessions" list on the CORE WWW page. If a session is being joined, the IP address and port number of the user who started the session are included in the MIME file. Joining executives use this information to connect with the original executive in a star topology.
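
The listening step described above can be sketched with standard sockets. Binding to port 0 and letting the operating system choose a free port is one common way to obtain an unused port; the paper does not say how the CORE executive selects its port, so this is an assumption, as are the function names.

```python
import socket


def open_listening_socket(host="127.0.0.1"):
    """Listen on an OS-chosen unused port and return (socket, port)."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind((host, 0))  # port 0 asks the OS for any free port
    server.listen()
    return server, server.getsockname()[1]


def join_session(host, port):
    """Connect a joining executive to the executive that started the session."""
    return socket.create_connection((host, port), timeout=5)
```

The starting executive would report the returned port number to the session manager; each joining executive then calls the equivalent of join_session with the address and port taken from its MIME file, which produces the star topology centered on the original executive.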

A schematic of CORE's communications pathways is shown in Fig. 3. The standard WWW communications pathways, and those of CORE's integrated tools are also shown, as are connections to the resources within the EMSL.

V. COLLABORATIVE TOOLS

Once an executive has prepared its own communications, either opening a listening socket, or connecting to a listening executive, it launches the requested collaborative tools. CORE provides a basic set of tools deemed necessary for distributed collaboration. Some of the tools have been developed as part of the CORE project and are highly integrated with the CORE executive, while others are the product of other EMSL Collaboratory projects and third party efforts and use their own communications once launched. A brief description of each of the capabilities follows:

A. WebTour
WebTour provides the ability to synchronize WWW browsers, allowing users to hold lectures or discussions, using material on the WWW. WebTour can be run in either lecture (only the leader's browser is echoed) or peer-to-peer modes. The WebTour functionality is embedded in the CORE executive and uses its communications to the browser and to other executives.
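
The difference between the two modes can be illustrated with a small in-memory model. This is a sketch under stated assumptions, not WebTour's implementation: the class and its behavior on join are hypothetical.

```python
class WebTour:
    """Toy model of browser synchronization in lecture or peer-to-peer mode."""

    def __init__(self, leader, mode="lecture"):
        if mode not in ("lecture", "peer"):
            raise ValueError("mode must be 'lecture' or 'peer'")
        self.leader = leader
        self.mode = mode
        self.current_url = {leader: None}  # participant -> URL being shown

    def join(self, user):
        # Assumption: a newcomer starts on the page the leader is showing.
        self.current_url[user] = self.current_url[self.leader]

    def navigate(self, user, url):
        """Record a page change; return True if it was echoed to everyone."""
        self.current_url[user] = url
        if self.mode == "lecture" and user != self.leader:
            return False  # only the leader's browser is echoed
        for participant in self.current_url:
            self.current_url[participant] = url
        return True
```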

B. File Sharing
CORE provides file sharing as an extension to the WebTour. Any local files opened in the user's WWW browser are transmitted to collaborators and opened with their browsers. Because it uses the WWW's browser/viewer mechanism, it allows remote users to choose different applications to view transferred files, e.g. users may choose different word processors to view a rich text format (RTF) file.

C. Chat Box
A simple chat box is included in the executive as well. Messages are tagged with the user names given when starting the session. Proper serialization is guaranteed by sending all messages to the central executive (the one that started the session) which then redistributes them to all executives in the session.
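
The serialization scheme described above can be modeled without any networking: because every message passes through one central point, that point alone fixes the order all participants see. The class below is a hypothetical in-memory sketch, with lists standing in for the socket connections to each executive.

```python
class CentralChat:
    """Models the central executive relaying chat messages in one order."""

    def __init__(self):
        self.inboxes = {}        # user name -> list of delivered messages
        self.next_sequence = 0   # order is assigned only at the center

    def join(self, user):
        self.inboxes[user] = []
        return self.inboxes[user]

    def post(self, user, text):
        if user not in self.inboxes:
            raise KeyError(user)
        message = (self.next_sequence, user, text)
        self.next_sequence += 1
        # Redistribute to every executive in the session, sender included.
        for inbox in self.inboxes.values():
            inbox.append(message)
        return message
```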

D. TeleViewer
The EMSL TeleViewer[6] provides a cross platform shared computer display. Users may select a rectangle or window from their computer, or their entire display to share with collaborators. Using this tool, users can view any program running on the shared display, such as word processors, spreadsheets, instrument control software, and mathematical computations. The display is repetitively differenced, compressed, and shared via the TeleViewer's independent sockets communications system. The TeleViewer will soon provide annotation on top of the live image and eventually the ability to remotely control the shared application.
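
The idea of differencing and compressing successive display captures can be sketched as follows. The TeleViewer's actual differencing and compression methods are not described here, so the byte-wise XOR difference and zlib compression below are stand-ins chosen for illustration.

```python
import zlib


def encode_update(previous, current):
    """Difference two equal-sized frames, then compress the difference."""
    if len(previous) != len(current):
        raise ValueError("frames must be the same size")
    # Unchanged bytes XOR to zero, so a mostly static display
    # produces long runs of zeros that compress very well.
    delta = bytes(a ^ b for a, b in zip(previous, current))
    return zlib.compress(delta)


def apply_update(previous, update):
    """Rebuild the current frame from the previous frame and an update."""
    delta = zlib.decompress(update)
    return bytes(a ^ b for a, b in zip(previous, delta))
```

When only a small region of the display changes between captures, the update is far smaller than the frame itself, which is what makes repeated sharing of a live display practical over a network.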

E. Electronic Notebook
The EMSL Electronic Laboratory Notebook (ELN)[7] provides users with a shared version of the traditional paper laboratory notebook. The current system creates a dynamic, user-queryable WWW page consisting of data files, an image or live Java based view of the data in each file, and information about each file (instrument parameters that were used, the operator's name, the date, etc.). Information from EMSL instruments that is sent to our enterprise database/archive system is queried via CGI scripts to provide automatic updates to the ELN. Users may select a subset of files to be displayed by sample, date, and owner name. Users also have the capability to add text, picture, and file annotations to the original information. Further enhancements to the ELN, based on Java[8] and CORBA[9], will increase the types of information that can be stored in the notebook.

F. On-line Instruments
Other projects within the EMSL are developing on-line instruments that can be run remotely via the internet. CORE provides a mechanism to select and launch this software, as well as providing a notebook for storage of the data acquired. One of the first of these instruments is a remote enabled radio frequency ion trap mass spectrometer.

G. Whiteboard
Whiteboards provide a shared space where users can write and draw. Currently CORE can launch the wb whiteboard[3] on UNIX platforms. Whiteboard support is not presently available on other platforms.

H. Audio/video conferencing
Audio/video conferencing allows collaborators to see and hear each other, as well as to monitor instruments and laboratories. CORE currently launches CU-SeeMe or nv and vat, depending on the user's platform. As part of the CORE project, the Laboratory set up a CU-SeeMe reflector bridge across our firewall that allows conferencing between EMSL researchers and external colleagues, while managing security.

VI. CORE USAGE

CORE is being used in several groups at present. The EMSL Collaboratory team uses CORE to work with collaborators in the Laboratory's Information Technologies department (located in another building) and with collaborators at Evergreen State College in Olympia, Washington, working on the electronic notebook. CORE has been demonstrated for the US Army between a site in Korea and sites at Ft. Lewis and in Richland, Washington, in combination with other Laboratory technologies for WWW based distance learning.

CORE has also been used to provide a remote lecture to Professor Jim Callis' Chem. 155 class at the University of Washington. The students were taught mass spectroscopy via videoconference and the WebTour by Dr. John Price at the EMSL, and then used his ion trap mass spectrometer remotely. Their data was instantly available to all participants via the WWW based notebook. The ion trap mass spectrometer is also the focus of a research collaboration with the University of Washington and CORE is being used for communication between the two sites.

Several opportunistic users of CORE have begun doing business presentations via CORE, relying on the WebTour to lecture based on material on the WWW. While this is not a scientific research use, it does provide valuable feedback and points to CORE's applicability in fields other than science.

VII. LESSONS LEARNED

CORE has been developed over the past twelve months, a time in which the internet and the WWW have changed greatly. In particular, the emergence of Java promises to revolutionize the development of dynamic WWW interfaces. However, the use of the WWW as a central interface to a collaborative environment has already proven to be a good choice. Researchers who are shown CORE immediately understand how to use it, and do not show the same hesitancy to try it that users of an earlier non-WWW prototype did. Making collaborative tools accessible through a bookmarked WWW page reduces the barrier to using them.

Additionally, the concept of a comprehensive environment for collaboration has shown promise. Users who begin with interest in one tool that solves a pressing need experiment with other tools over time. These experimental uses may begin to influence and broaden the planned use of CORE in collaborations.

From a developer's standpoint, the use of the WWW makes support of multiple platforms easier. The main CORE interface is available across Macintosh, Windows, and UNIX platforms, while many of the tools we would like to use are not. With Java, the effort required to develop and maintain CORE across multiple platforms should be reduced further.

Using WWW communications has had positive and negative aspects. The use of a CGI script to communicate with a separate session manager allowed us to develop a WWW interface for CORE without major modifications to existing session manager code. However, the stateless nature of WWW connections forced us to use hidden fields on forms (cookies were not available on all of the browsers we wished to support when the project began). While workable, this solution is not very elegant.
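
The hidden-field technique mentioned above works by writing the state into each generated form so that the browser returns it with the next submission. The Python sketch below shows the idea; the field names and the "/cgi-bin/core" action URL are hypothetical, not taken from CORE.

```python
from html import escape


def render_session_form(action_url, state):
    """Build a form whose hidden inputs carry state to the next request."""
    hidden = "\n".join(
        '  <input type="hidden" name="%s" value="%s">'
        % (escape(name, quote=True), escape(value, quote=True))
        for name, value in state.items()
    )
    return (
        '<form method="POST" action="%s">\n' % escape(action_url, quote=True)
        + hidden
        + '\n  <input type="submit" value="Join">\n</form>'
    )
```

Each script invocation is independent, so anything the next invocation needs (the user's name, the chosen session) must travel inside the page in this way.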

One area of continuing difficulty with the CORE approach of launching third party tools is the lack of standard interfaces to such tools. Even the major WWW browsers do not have a single standard for sending and receiving uniform resource locators (URLs) to and from a local application. The National Center for Supercomputing Applications (NCSA) has developed a sockets based common client interface (CCI) for Mosaic that is used by some browsers, while Netscape provides platform specific mechanisms for its Navigator (and they even provide two mechanisms under Windows, using DDE and OLE, that provide slightly different functionality). Similarly, while CU-SeeMe will accept only an IP address to connect to on the command line, other tools, such as the wb whiteboard and the EMSL TeleViewer, will also accept session names and other configuration information on the command line, making it much easier to integrate them. Unless standards emerge in this area, adding a new tool to CORE will always require additional programming.

VIII. CONCLUSIONS

Collaborative environments, such as the EMSL's CORE, can provide users with a single interface to the wide range of collaborative capabilities required in scientific research collaborations. Further, collaborative environments can integrate tools, or sets of compatible tools, to provide these capabilities across computer platforms. By hiding the complexities of configuring individual tools, collaborative environments can reduce the barrier to communicating with remote colleagues. Further improvements to usability of the environment can be made by using the WWW as the environment's interface, leveraging users' understanding of browser and helper applications to make electronic collaboration capabilities appear as simple extensions to the WWW.

IX. REFERENCES

[1] "EMSL Home Page", Pacific Northwest National Laboratory http://www.emsl.pnl.gov:2080/homes/homepage.html

[2] "Collaboratory for Environmental Molecular Sciences", Pacific Northwest National Laboratory http://www.emsl.pnl.gov:2080/docs/collab/

[3] "Introduction to Videoconferencing and the MBONE", C.T. Larsen, Jan. 9, 1995, http://www.lbl.gov/vconf-faq.html

[4] "CU-SeeMe Welcome Page", Cornell University, http://cu-seeme.cornell.edu/

[5] Frivold, Thane, Ruth Lang, and Martin Fong, "Extending WWW for Synchronous Collaboration" in "Proceedings of The International WWW Conference '94: Mosaic and the Web," Chicago, IL, 17-20 October 1994, pp 333-341. (Also http://fs2/WWW94/ExtendingWWW.html).

[6] "The EMSL TeleViewer: A Collaborative Shared Computer Display", P.E. Keller, J.D. Myers, submitted to the IEEE Fifth Workshops on Enabling Technologies: Infrastructure for Collaborative Enterprises, June 19-21, 1996, Stanford U., Palo Alto, CA

[7] "Electronic Laboratory Notebooks for Collaborative Research", J.D. Myers, D. Le, J. Laird, C. Fox-Dobbs, D. Reich, T. Curtz, submitted to the IEEE Fifth Workshops on Enabling Technologies: Infrastructure for Collaborative Enterprises, June 19-21, 1996, Stanford U., Palo Alto, CA

[8] "Java: Programming for the Internet", http://java.sun.com

[9] "What is CORBA???", http://ruby.omg.org/corba.htm

X. ACKNOWLEDGEMENTS

This work was supported by the U.S. Department of Energy through the Distributed Collaboratory Experiment Environments (DCEE) program sponsored by the Mathematical, Information and Computational Sciences Division of the Office of Energy Research, and through the Laboratory Directed Research and Development program at Pacific Northwest National Laboratory (PNNL). PNNL is a multiprogram national laboratory operated by Battelle Memorial Institute for the U.S. Department of Energy under Contract DE-AC06-76RLO 1830.