Copyright 1996 IEEE. Published in IEEE Computer, Volume 29, Number 8, August 1996. This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.


Collaboratories: Doing Science On The Internet

Richard T. Kouzes
West Virginia University

James D. Myers
Pacific Northwest National Laboratory

William A. Wulf
University of Virginia


The success of many complex scientific investigations hinges on bringing the capabilities of diverse individuals from multiple institutions together with state-of-the-art instrumentation.

We are all aware of the tremendous impact computers have had on science and engineering in the past 50 years, but the impact in the near future may be far greater. A 1993 National Research Council study suggested that

The fusion of computers and electronic communications has the potential to dramatically enhance the output and productivity of U.S. researchers. A major step toward realizing that potential can come from combining the interests of the scientific community at large with those of the computer science and engineering community to create integrated, tool-oriented computing and communications systems to support scientific collaboration. Such systems can be called "collaboratories."1
   The term collaboratory was coined by William Wulf while he worked for the US National Science Foundation. Wulf merged the words collaboration and laboratory, and defined a collaboratory1 as a
...center without walls, in which the nation's researchers can perform their research without regard to geographical location - interacting with colleagues, accessing instrumentation, sharing data and computational resources, and accessing information in digital libraries.
    To what degree can we realize this potential? Computer scientists working with domain specialists have made progress on several fronts to create and integrate the tools required for Internet-based scientific collaboration. However, both technical and sociological challenges remain.
    Collaboration is at the heart of science, with a tradition spanning centuries. However, while science has benefitted greatly from the computing revolution of the last four decades, technology's impact on the collaborative process itself has been insignificant when compared to our expectations for the near future.
    Scientific collaborations currently rely heavily on face-to-face interactions, group meetings, individual action, and hands-on experimentation. Group size varies widely, from as few as three people in molecular chemistry to 300 in high-energy physics.
    The tools of computer-supported cooperative work are now being applied to such collaborations.2 Through immersive electronic interaction, team members distributed across a widespread area can collaborate, using the newest instruments and computing resources. A new paradigm for intimate collaboration among scientists is thus emerging that will accelerate the development and dissemination of basic knowledge, optimize the use of research instruments, and minimize the time between discovery and application.
    A collaboratory facilitates scientific interaction within a team by creating a new, artificial environment in which individuals can interact. This new place must be socially acceptable to the people who participate and improve their ability to work. Many computing tools must be brought together and integrated to allow seamless interaction. Some of these tools are already in wide use, such as electronic mail and the World Wide Web, while others, like telepresence - the immersive electronic simulation of "being there" - are still being created by researchers.

COLLABORATORY PROTOTYPES
    To facilitate scientific work, collaboratory systems must support the sharing of secure data, analysis, instruments, and interaction spaces.3,4 Several systems incorporate these basic components.5
    An advanced example of a data-driven collaboration is the Worm Community System, one of the original collaboratory projects sponsored by the NSF in 1990.6 This system supports researchers studying the nematode C. elegans, which is a harmless, soil-residing worm of little human significance. The Worm Community System provides a repository for data about the nematode, ranging from genome to behavior level, and ties this data to the literature. Everything known about C. elegans and everyone contributing to this knowledge is accessible through the system. These capabilities elevate the Worm Community System from a simple tool for sharing data to an electronic forum.
    Providing access to scientific instruments from distant locations is another common focus of collaboratories. Early collaboratories focused on the sharing of large, expensive instruments such as astronomical telescopes, particle accelerators, oceanographic instruments, atmospheric observatories, and space research applications. The Upper Atmospheric Research Collaboratory, another NSF-funded project, is an example. UARC provides six institutions access to instruments in Greenland for solar wind observation. (For more information, see http://www.si.umich.edu/UARC/HomePage.html). UARC collaborators exchange and archive multimedia information from the instruments and the measurements analysis. Other collaboratory projects share smaller devices, such as electron microscopes, scanning tunneling microscopes, and nuclear magnetic resonance instruments.
    Developing a shared interaction space across several laboratories is the ambitious goal of a new initiative from the US Department of Energy. The DOE's Distributed Collaboratory Experiment Environments Program involves four major projects, briefly described in the "Distributed Collaboratory Experiment Environments" sidebar. These are the first efforts of a major program to develop the technology for a virtual laboratory system that encompasses the scientific resources of the national laboratory system. The goal is to enable scientists around the world to participate in solving DOE's science and technology challenges.

SOCIOLOGY OF COLLABORATION
    Facilitating collaboration among a widely distributed scientific community is highly complex. Although a collaboratory is potentially nothing less than the village square of the Information Age, it is a synthetic place requiring social adaptation.
    Is such a place socially sustainable? The requirements for a technological system's success seem contrary to the sentiment expressed in the motto for the 1933 Chicago World's Fair: "Science finds, industry applies, man conforms." Today we expect technology to adapt to the user. Some people contend that "technology is incompatible with a gentle and humane society,"7 but we believe that technology implemented with an awareness of human needs can facilitate the collaborative process.
    Asserting the social acceptability of a synthetic "place" does not make it so, of course. Thus, in addition to purely technical issues, the research agenda for creating collaboratories must address fundamental psychosocial questions as well. Is it possible to electronically create a suitable sense of place that permits, and enhances, the successful cooperation of dispersed individuals toward common goals? How can we support communication that permits human cooperation even when the evolutionary social mechanisms that depend on proximity are absent?
The silent language of body motion and spatial position is central to human communications and social control; can this richness of human interaction be provided? Indeed, which aspects of this richness are critical and which are not?
    Collaboratory developers must consider psychosocial issues such as autonomy, trust, sense of place, and attention to ritual. Autonomy, which describes how an organization is governed or regulated, is implemented through informal communications, acquaintances, and associations. Collaboratory developers must embed autonomy into the virtual organization in a considered manner. Trust, which is established among collaborators through shared experience, is implemented over time through informal means such as meeting face to face and working together in the same place. In a collaboratory, trust will have to be established through some special means. A sense of place, which allows people to feel comfortable in their surroundings, provides security so that people can feel creative. If a collaboratory can harness some of the design strategies that have been so successful in physical group settings, it can also create a sense of place and purpose among its dispersed members that will engender an enduring sense of affiliation and cooperation toward its goals. The mechanisms of ritual, which moderate our interpersonal interactions, must find a place in the synthetic surroundings of a collaboratory.
    Technology solutions abound, but often fail to find a human problem to solve. Groupware applications, like those for meeting scheduling, group decision support, joint authorship, and distributed management, have had mixed success. The failure of groupware to gain wider acceptance is due to its primitive technology and its insensitivity to social and political issues in the workplace. Groupware applications for a collaboratory will have to be selected and implemented with a clear understanding of the social and political concerns that characterize joint scientific work. Among these are issues of authorship, acknowledgement of contributions, esteem of peers, and recognition by professional role models. Without attention to these concerns, groupware applications will not find acceptance.

COLLABORATION TYPES
    We can learn much about the process of scientific collaboration from discussions with scientists themselves. As part of the development of DOE's Environmental Molecular Sciences Laboratory project, researchers were asked about the nature of their current and future collaborations in order to understand what types of communications an electronic collaborative environment must support. Scientific collaborations span a wide range in terms of group size, collaboration style, and focus (experimental, theoretical, computational). The focus at EMSL is basic scientific research, undertaken by as many as 300 researchers.
Distributed Collaboratory Experiment Environments
    The US Department of Energy has initiated a series of four major collaboratory projects known as the Distributed Collaboratory Experiment Environments Program.
    Argonne National Laboratory and Northeastern University are building and testing LabSpace: A National Electronic Laboratory Infrastructure. They are implementing a shared space with persistence and history in two application testbeds. The first is the Telepresence Electron Microscopy project, which is designed to allow the remote use of the Advanced Analytical Electron Microscope and the Analytical Scanning Electron Microscope. The second involves a collaboration with CERN, the European high-energy physics center, which will exercise the LabSpace version of a collaboratory in a larger collaboration over an international data link.
    Lawrence Livermore National Laboratory, Oak Ridge National Laboratory, the Princeton Plasma Physics Laboratory, and General Atomics are pursuing the Distributed Computing Testbed for a Remote Experimental Environment. The experiment focuses on fusion energy R&D, an archetype of research that must be carried out at a few large central facilities with national and international participation. The experiments also are designed for steady-state operation, for which interactive, real-time experimentation becomes important. Remote operations will be conducted at the DIII-D tokamak, a fusion research instrument located at General Atomics, San Diego, California. This project demands not only real-time synchronization and exchange of data among multiple computer networks, but also the presentation of sufficient auditory and visual information associated with the control-room environment so that remote staff at multiple sites can be fully integrated in operations.
    Pacific Northwest National Laboratory is working on Collaboratory Development in the Environmental and Molecular Sciences. The Core testbed (described in the main text) is based on instrumentation being developed for the Environmental Molecular Sciences Laboratory project at PNNL. This includes two unique nuclear magnetic resonance spectrometers, which are large, highly shared items, and some small instruments used by a limited number of researchers in molecular-beam reaction dynamics. Thus the characteristics of two related yet distinct scientific cultures, working with two quite different kinds of machines, are being examined.
    The University of Wisconsin-Milwaukee will remotely operate a sophisticated synchrotron-radiation beamline in The Spectro Microscopy Laboratory at the Advanced Light Source project. This collaboratory development project will provide remote access to three analytical tools at the Advanced Light Source located at Lawrence Berkeley National Laboratory that provide spatially resolved chemical information at length scales ranging from one micron down to the atomic scale. The collaboration that uses these instruments is fairly large and geographically distributed, with investigators from nine institutions, so the potential for saving the time and expense of training, staffing, and travel is considerable. The ongoing growth trend of synchrotron-radiation applications will provide a large and welcoming audience for the results. One particularly interesting target audience for remote usage of this and similar facilities is the semiconductor industry, which has a critical need for sample inspection.

    For more information on these projects, see http://www-itg.lbl.gov/~jtcjew/DCEE_Overview.html.


    On the basis of this feedback, we identified four broad categories:

    It is important to note that although we present these classifications and examples as distinct types, a single collaboration may actually contain elements from several styles, either in parallel or as the collaboration evolves. Nevertheless, these categories do help to show the varying communications needs researchers have as they work in different modes and how an individual's needs may change as the task or nature of the collaboration changes. The fact that researchers may switch collaboration styles frequently as they work through various tasks in an experiment implies that an electronic collaboratory environment should not impose a particular mode. It should instead provide a wide range of capabilities that can be quickly and easily selected and configured for the task at hand. Such flexibility addresses some of the social barriers inhibiting collaboration.

FIGURE 1. Tools provide varying functionality. Some are synchronous, while others are asynchronous. Some work well for more static applications, while some are inherently dynamic.

COLLABORATION TECHNOLOGY
    Electronic collaboration must occur in an environment that lets collaborators work intimately with one another.
    Current implementations use an integrated set of cross-platform tools such as electronic notebooks, video-conferencing systems, electronic whiteboards, shared screens, information-access tools, and instrument-control tools. Figure 1 illustrates how different tools provide varying functionality in interactions depending upon the static or dynamic nature of the information exchange as well as upon the synchronous or asynchronous nature of the session.
    Electronic mail supports collaboration via a time-serial dialog. Videoconferencing supports real-time discussion and, with the addition of graphics and whiteboard capabilities, presentation and brainstorming.
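    The two axes described for Figure 1 can be sketched as a small lookup table. This is only an illustration: the text places electronic mail on the asynchronous side and videoconferencing on the real-time side, but every other placement below, including all static/dynamic labels, is an assumption made for the example and is not taken from the figure. The names `Tool`, `TOOLS`, and `tools_for` are hypothetical.

```python
from enum import Enum
from typing import NamedTuple, Optional


class Timing(Enum):
    SYNCHRONOUS = "synchronous"
    ASYNCHRONOUS = "asynchronous"


class Content(Enum):
    STATIC = "static"
    DYNAMIC = "dynamic"


class Tool(NamedTuple):
    name: str
    timing: Timing
    content: Content


# Only the timing of the first two entries comes from the article text.
# All other placements are assumptions for illustration.
TOOLS = [
    Tool("electronic mail", Timing.ASYNCHRONOUS, Content.STATIC),
    Tool("videoconferencing", Timing.SYNCHRONOUS, Content.DYNAMIC),
    Tool("electronic notebook", Timing.ASYNCHRONOUS, Content.DYNAMIC),
    Tool("electronic whiteboard", Timing.SYNCHRONOUS, Content.DYNAMIC),
]


def tools_for(timing: Timing, content: Optional[Content] = None) -> list:
    """Return the names of tools matching a session's timing and, optionally, content type."""
    return [
        tool.name
        for tool in TOOLS
        if tool.timing is timing and (content is None or tool.content is content)
    ]


print(tools_for(Timing.SYNCHRONOUS))
print(tools_for(Timing.ASYNCHRONOUS, Content.STATIC))
```

    Keeping the classification in data rather than in control flow fits the requirement that an environment offer capabilities that can be selected for the task at hand instead of imposing one mode.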
    Because the collaboratory concept brings all the scientific resources used by researchers into the mix, both real-time work and asynchronous collaboration are possible. The effect of having all scientific resources available to all researchers moves a remote collaborator from the role of part-time consultant to co-worker.
    Full support for this vision requires substantial additional work, but progress is being made on prototype systems and on the robust, secure, scalable architecture required for production systems. Network infrastructure is vital for supporting collaboratory-style interaction and linking the high-performance computing system, experimental equipment, data-acquisition systems, and the scientist's desktop workstation into a unified research tool.
    As part of the DCEE program, the EMSL has developed Core, a prototype collaboratory that provides a loosely integrated set of Internet capabilities that appear as extensions to the Web. Core provides a one-click method to start or join multi-tool collaborative sessions from a Web page. The Core Session Manager and Desktop Executive launch and track active sessions, participants, and tools, letting users pick capabilities appropriate for their work without having to be aware of the connection syntax of individual tools, port numbers, firewalls, or Internet addresses.
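    The bookkeeping role described above, tracking sessions, participants, and tools so that users never handle addresses or port numbers themselves, might be sketched as follows. This is not Core's actual design or interface, which the article does not specify. Every class, method, host name, and port number here is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ToolEndpoint:
    """Connection details that the user should never have to type by hand."""
    name: str   # hypothetical tool name, e.g. "whiteboard"
    host: str   # hypothetical host
    port: int   # hypothetical port


@dataclass
class Session:
    title: str
    tools: dict = field(default_factory=dict)       # tool name -> ToolEndpoint
    participants: set = field(default_factory=set)  # user names


class SessionManager:
    """Hypothetical registry of active sessions, their participants, and their tools."""

    def __init__(self):
        self._sessions = {}

    def start(self, title, owner, endpoints):
        """Create a session, register its tools, and add the owner as first participant."""
        session = Session(title=title)
        for endpoint in endpoints:
            session.tools[endpoint.name] = endpoint
        session.participants.add(owner)
        self._sessions[title] = session
        return session

    def join(self, title, user):
        """The 'one click': a session title goes in, ready-to-use endpoints come out."""
        session = self._sessions[title]
        session.participants.add(user)
        return sorted(session.tools.values(), key=lambda endpoint: endpoint.name)

    def participants(self, title):
        return sorted(self._sessions[title].participants)

    def active(self):
        return sorted(self._sessions)


manager = SessionManager()
manager.start(
    "nmr-run",
    owner="alice",
    endpoints=[
        ToolEndpoint("whiteboard", "host-a.example", 5001),
        ToolEndpoint("video", "host-b.example", 5002),
    ],
)
joined = manager.join("nmr-run", "bob")
print([endpoint.name for endpoint in joined])
print(manager.participants("nmr-run"))
```

    In this sketch the person joining supplies only a session title. The registry holds the connection details, which is the separation the paragraph above attributes to the Session Manager and Desktop Executive.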
    Core and the tools it uses are under development at PNNL and other national laboratories and universities. The tools have been chosen because they provide a wide range of cross-platform functions that let researchers interact with remote colleagues in a rich, in-process style. The tools include:


    These tools must be integrated into a user-friendly environment cognizant of users' psychosocial needs. Emerging technologies will quickly drive such enhancements as common security, session management, communications programming interfaces, object-oriented scientific data models, and models of the experimental process. Emerging standards for videoconferencing and whiteboards, and cross-platform languages such as Java, will also contribute to the creation of highly integrated and highly extensible collaboration environments.
    At that point, new two- or three-dimensional interfaces to environments can be developed. A laboratory notebook may contain not only notes and drawings, but instrument controls, real-time data graphs, and videoconferencing windows. An immersive, virtual building may let users see and hear each other, with persistent whiteboards for group notes, and shared simulations and virtual instruments set up in various laboratories. Figure 3 shows a collaboratory software environment and its tools.

FIGURE 3. A collaboratory software environment uses software tools such as whiteboards, electronic notebooks, chat boxes, televiewers, information browsers, and videoconferencing to facilitate effective interactions between dispersed scientists. The present tool set, a realization using today's applications, is rapidly evolving as more is being learned about facilitating the collaborative process.

BARRIERS TO ADOPTION
    The barriers to implementing these environments are both technical and sociological. Existing tools are often immature, unintegrated, hard to support, and costly to maintain.
    The adage "build it and they will come" is disproved daily in the computing industry. A technology often exists without being used because it is perceived as adding little or no value. While we contend that the time is right for a collaboratory solution to the needs of scientific interaction, we are challenged to make it a viable necessity for scientific progress, as well as psychosocially acceptable.
    We can make glowing predictions about the value collaboratories bring to scientific inquiry; we can make equally valid projections of why they will fail. We know how difficult it is to see the future clearly. We remember what the father of radio, Lee De Forest, said about television in 1926: "While theoretically and technically television may be feasible, commercially and financially I consider it an impossibility, a development of which we need waste little time dreaming."
    Videoconferencing has been slow to catch on partly because of its cost, hardware restrictions, lack of standards, and poor audio and video quality.8 The perceived benefit of videoconferencing is not sufficient to overcome the problems of using available systems.9
    Mbone-based freeware applications, which we use in several DOE projects for videoconferencing, originated only in 1992.10 Mbone is now used extensively by a small class of Internet users for videoconferencing and broadcasts. The Unix-based Mbone video applications provide frame rates of only a few per second, while consuming about 200 Kbytes/sec of network bandwidth. Mbone is now providing one-way videoconferencing connectivity to Mac and PC platforms. Cross-platform whiteboards, electronic notebooks, and shared screens use less bandwidth than video but remain in an early state of development, especially with regard to interoperability.
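    To put the quoted video figure in network terms, the short calculation below converts 200 Kbytes/sec to megabits per second and derives a rough per-frame budget. The frame rate of 3 is an assumption standing in for "a few per second", the decimal definition of a kilobyte is an assumption, and the comparison with a nominal T1 line rate (1.544 Mbit/s) is added here for scale and does not come from the article.

```python
BITS_PER_BYTE = 8


def kbytes_per_sec_to_mbits_per_sec(kbytes_per_sec, bytes_per_kbyte=1000):
    """Convert Kbytes/sec to Mbit/s (decimal kilobytes assumed by default)."""
    return kbytes_per_sec * bytes_per_kbyte * BITS_PER_BYTE / 1_000_000


VIDEO_KBYTES_PER_SEC = 200   # figure quoted in the text
ASSUMED_FRAMES_PER_SEC = 3   # assumption: the text says only "a few per second"
T1_MBITS_PER_SEC = 1.544     # nominal T1 line rate, used here only for comparison

stream_mbits = kbytes_per_sec_to_mbits_per_sec(VIDEO_KBYTES_PER_SEC)
kbytes_per_frame = VIDEO_KBYTES_PER_SEC / ASSUMED_FRAMES_PER_SEC

print(f"one video stream: {stream_mbits:.1f} Mbit/s")
print(f"per-frame budget at {ASSUMED_FRAMES_PER_SEC} frames/s: {kbytes_per_frame:.0f} Kbytes")
print(f"fits on a single T1 line: {stream_mbits <= T1_MBITS_PER_SEC}")
```

    Under these assumptions a single stream at the quoted rate slightly exceeds one T1 line, which helps explain why the lower-bandwidth whiteboard, notebook, and shared-screen tools matter.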
    Collaboratory tools need several more years of research before they are mature enough to be acceptable to end users.

WE PREDICT THAT COLLABORATORIES will be part of our future, that they will be rich telepresent environments, and that virtual laboratories will proliferate. We also conjecture that complexity will drive the need for increased collaboration in scientific endeavors and new research funding models. As educators, we will be faced with training our students to work in a multidisciplinary world of complex problems, which will cause us to adopt new educational strategies. The collaboratory concept is a qualitatively different way of using communication and information technologies. It has the potential to remove the walls around departments and organizations, and it will lead to the creation of a metalaboratory with capabilities that far exceed those available in any single laboratory. In the next few years, a growing community of scientists from multiple universities and national laboratories will be conducting serious scientific research in cyberspace.

Acknowledgements


The work described in this article was partially supported by West Virginia University, The University of Virginia, and the Laboratory Directed Research and Development program at Pacific Northwest National Laboratory. PNNL is a multiprogram national laboratory operated by Battelle Memorial Institute for the U.S. Department of Energy under contract DE-AC06-76RLO 1830. Many individuals have contributed to the PNNL collaboratory effort, including Paul Keller, Gina Najera, Deborah Payne, John Price, Ian Roberts, Anne Schur, and Jim Wise. We gratefully acknowledge contributions by many individuals to this work. For more information see http://www.wvu.edu/~research/.

References


  1. V.G. Cerf et al., National Collaboratories: Applying Information Technologies for Scientific Research, National Academy Press: Washington, D.C., 1993.
  2. P.S. Malm, "The unOfficial Yellow Pages of CSCW, Groupware, Prototypes, and Projects," in Classification of Cooperative Systems from a Technological Perspective, Groupware in Local Government Administration, doctoral dissertation, University of Tromso, Norway, 1994.
  3. R.T. Kouzes, "Creating the Cyberspace Laboratory," in The World & I, The Washington Times Corp., Washington, D.C., 1995, pp. 190-197.
  4. R. Pool et al., "Beyond Databases and E-Mail," Science, Aug. 1993, pp. 841-872.
  5. R.T. Kouzes, "The Collaboratory: Creating R&D Laboratories Without Walls," in Electronic Laboratory Notebooks and Collaborative Computing in R&D: Social, Legal, Regulatory, and Technology Issues, Rich Lysakowski, ed., Team Science Publishing, Sudbury, Mass., 1996, to appear.
  6. B.R. Schatz, "Building an Electronic Community System," J. Management & Information Systems, 1992.
  7. G. Levy, "The Gods Must Be Crazy," American Laboratory, Aug. 1995, pp. 6-8.
  8. E. Garland and D. Rowell, "Face-to-Face Collaboration," Byte, Nov. 1994, pp. 233-242.
  9. W. Machrone, "Seeing Is Almost Believing," PC Magazine, June 14, 1994, pp. 233-251.
  10. H. Eriksson, "MBONE: The Multicast Backbone," Comm. of ACM, Aug. 1994, pp. 54-60.

Richard T. Kouzes is the director of program development for science and engineering, and professor of physics, at West Virginia University. He has worked in the field of collaborative computing and is a principal investigator on a DOE Distributed Collaborative Experiment Environment project. He is the originator of the PNNL collaboratory effort. He has been a leader in the field of computerized data acquisition and was a founder and past chair of the IEEE Committee for Computer Applications in Nuclear and Plasma Sciences. Kouzes received a BS in physics from Michigan State University and an MS and a PhD in physics from Princeton University.

James D. Myers is a senior research scientist in the Computing and Information Sciences department of the Environmental Molecular Sciences Laboratory at PNNL, the EMSL Collaboratory project leader and a principal investigator on a DOE Distributed Collaborative Experiment Environment project, working to design, develop, deploy, and understand the use of the EMSL Collaborative Research Environment software. He is the recipient of a 1996 Associated Western Universities Distinguished Lectureship for his work in promoting the collaboratory concept for scientific research, undergraduate research and education, and K-12 education. Myers received a BA in physics from Cornell University and a PhD in chemistry from the University of California, Berkeley.

William A. Wulf is the AT&T professor of engineering and applied science at the University of Virginia, where he is revising the undergraduate computer science curriculum, researching computer architecture and computer security, and assisting scholars in the humanities to exploit information technology. Wulf chairs the Computer Science and Telecommunications Board of the National Research Council and is the interim president of the National Academy of Engineering.

Contact Kouzes at rkouzes@wvnvm.wvnet.edu; Myers at jim.myers@pnl.gov; and Wulf at wulf@viper.cs.virginia.edu.