Collaboratories: Scientists Working Together Apart
Richard T. Kouzes, Environmental Molecular Sciences Laboratory, Pacific Northwest Laboratory, PO Box 999, MS K1-87, Richland, WA 99352, (509)375-6455, (509)375-6631 fax, rt_kouzes@pnl.gov
FIGURE 1: Borromean rings show three symmetric interlocking rings, no two of which are interlinked, yet removing one destroys the synergism, representing the symbiotic nature of a Collaboratory.
Abstract
A Collaboratory is an open meta-laboratory that spans multiple geographical areas with collaborators interacting via electronic means - "working together apart." Collaboratories are designed to enable close ties between scientists in a given research area, to promote collaborations involving scientists in diverse areas, to accelerate the development and dissemination of basic knowledge, and to minimize the time-lag between discovery and application.
The Collaboratory Paradigm
Cyberscientists are combining software tools and high speed computer networks with cognizance of the sociology of science and human nature to create an electronic meeting place for interaction among scientific team members.
Introduction to Collaboratories
She had spent many long days in her laboratory at Princeton University, and in other places in her virtual laboratory space, along with her students doing all of the preparatory work. That morning she had met with her collaborators from around the country in a virtual conference to make the final plans for the experimental sequence which would hopefully answer some burning issues surrounding bioremediation of hazardous chemical waste. The experiment would begin shortly. She would be using the most advanced NMR spectrometer available, a one gigahertz instrument located at the U.S. Department of Energy's Environmental Molecular Sciences Laboratory in Richland, Washington. She looked out of her window as the sun was setting over the Princeton Graduate College, then sat down at her computer, entering her virtual laboratory to begin collaborating with her colleagues a continent away. The sample they had prepared had been sent out the week before and was in place in the 23.5 tesla magnet in Richland. Checking the settings for the NMR on her computer monitor, together they prepared to carry out the experiment...
Science is a complex intertwining of creativity, discovery and interpretation which builds a body of truth from fragments of knowledge learned through the research process. Collaboration is at the heart of science. The renowned scientist Sir Isaac Newton said "If I have seen further, it is by standing on the shoulders of giants." The collaborations found in scientific research are carried out with a tradition spanning many decades, if not centuries. Science has benefitted greatly from the computing revolution of the last four decades. However, while technology has made science as we know it possible, it has not yet impacted on the collaboration process itself to the degree we will see in the next ten years. Science is not alone in this - computing and communications technology promise to change the way we perform all the tasks of our human enterprise.
Scientific collaborations may consist of three people working together in the field of molecular chemistry or 300 in the case of high energy physics. Collaborations of the present rely heavily on face-to-face interactions, group meetings, individual reflection, and hands-on experimentation. This is a changing landscape. The collaborations which make the scientific process effective are now gaining from the computing tools being created in the field of computer supported cooperative work. Collaborations occur through time, such as that alluded to by Newton, building knowledge upon a foundation of past work. Collaborations also occur through the multiplicative factor of direct interaction among teams of individuals. The future promises the ability for teams of the best scientists to conduct collaborative research using the newest instruments and computing resources through immersive electronic interaction from afar - virtual laboratory spaces. A new paradigm for intimate collaboration between scientists is thus emerging, accelerating the development and dissemination of basic knowledge, and optimizing the utilization of research instruments, while minimizing the time-lag between discovery and application. The graduate student who today travels across the country to access a unique resource, or more likely does not gain such access due to the cost of travel, will be able to learn and contribute during that most creative time in their scientific life.
When Professor William Wulf of the University of Virginia coined the word "Collaboratory" as a merger of the words "collaboration" and "laboratory" in 1989, he defined it as a "...'center without walls,' in which the nation's researchers can perform their research without regard to geographical location - interacting with colleagues, accessing instrumentation, sharing data and computational resources, and accessing information in digital libraries."
We have a future vision of computer enabled tele-present cooperative work which is far from the present state, yet we are beginning to take steps to reach our vision. Much of the implementation of Collaboratories hinges on the existence of the Internet, which is an integration of many electronic computer networks in government laboratories, universities, and industry allowing computers anywhere in the world to communicate and share information. The extension of the Internet to the Information Superhighway, now only a two lane dirt road, will profoundly impact the evolution of science and commerce in the United States and the world. The impact of electronic mail (email) is the first inkling of the communications revolution which will affect the way in which we do science. We are also witnessing the beginnings of electronic publication which will eventually displace the traditional paper journals as the major mechanism for scientific communications.
FIGURE 2: Collaboratories tie researchers from disparate locations together to enable more effective collaborative science.
Collaboratory Cornerstones
The realization of a true Collaboratory is still a vision, yet baby steps have been taken. Immersive data exploration tools with cognizance of human psychological needs have been developed, allowing a researcher the ability to cruise through a multidimensional data set to gain a greater understanding of its texture and inter-relationships. Molecular biologists are pooling their knowledge of gene sequences and gene maps by establishing and maintaining large databases. Space physicists and oceanographers share their data. Worms are studied through a world-wide network.
The Worm Community System (WCS) is an advanced example of a data driven collaboration. It allows searching of the literature, including journals, newsletters, informal notes, and the data of researchers studying the nematode C. elegans, which is a harmless worm of little significance to humans, commonly found in the soil. These capabilities elevate the WCS from a simple tool for sharing data to an electronic forum that also allows sharing insights generated by the data.
Experimentation using remote instruments is a common driver for development of collaborations. Early work in remote experimentation has focused on very large instruments such as telescopes, accelerators, oceanographic instruments, and space applications. More recently, smaller instruments such as electron microscopes and scanning tunneling microscopes have been enabled for remote access. These latter applications are more typical of many areas in physics and chemistry.
The Upper Atmospheric Research Collaboratory (UARC) provides access for half a dozen institutions to instruments sited in Greenland for observation of the solar wind. The UARC collaborators exchange and archive multi-media information from the instruments and their analysis of the measurements.
There have also been significant advances in applying collaborative tools for use in the area of medicine. A statewide implementation of tele-medicine has been deployed in Iowa. The military have invested heavily in tele-medicine technology and have deployed systems to such locations as Somalia and Haiti. Tele-medicine provides for remote diagnosis and consultation between surgeons, and opens the possibility of remote surgery. The military are striving to have the capability of delivering the best surgical skill to the battlefield during the "golden hour" after trauma through remotely controlled robotics. To succeed, the surgeon will need immersive visual, tactile, acoustic and olfactory feedback - a technological challenge presently being tackled. The National Information Infrastructure Testbed (NIIT), a consortium of over 55 companies, universities and government agencies, recently demonstrated a complex national tele-medicine prototype implementation for the U.S. Congress.
A recent incarnation of a collaborative environment for science is the BioMOO at the Weizmann Institute in Israel (URL http://bioinfo.weizmann.ac.il:8888). The BioMOO, now over a year old, exists only in cyberspace as a meeting place for biological researchers. The MOO (Multiple user dungeon, Object Oriented) is an implementation of a MUD (Multiple User Dungeon or Multiple User Dialogue), a remotely accessed computer program for exploration, where users navigate via text commands in much the same way as older computer games. While navigating through the BioMOO, one encounters bulletin boards, seminars, and other individuals. This virtual meeting place provides a new interaction paradigm for science. Graphical interface versions of scientific MOOs cannot be far away.
A Collaboratory for the Environmental Molecular Sciences
The Environmental Molecular Sciences Laboratory (EMSL) at the U.S. Department of Energy's Pacific Northwest Laboratory (PNL) in Richland, Washington, has the mission of bringing together experts from many scientific disciplines to help solve the nation's environmental problems. The task of environmental remediation facing our nation is enormous, and since the problems are so complex, solutions will require the integration of work in many fields. Presently, the research in chemistry, physics and computer science that is needed to clean up the environmental legacy of the last century is pursued at many institutions by many small groups with infrequent exchanges of ideas and results. It is often a reality that results from one scientific domain remain unknown to another for decades.
While the EMSL is a physical laboratory, we intend to evolve it into a laboratory without walls, a virtual place which will span the nation as a Collaboratory for environmental research. A Collaboratory approach within the EMSL promises significant advantages over the current, often uncoordinated way of conducting scientific research. By enabling the process of electronically bringing together researchers in a multi-disciplinary approach to the environment, the Collaboratory will allow a coordinated attack which promises rapid progress in resolving the complex environmental challenges we face.
FIGURE 3: An example of the types of instrument found in environmental molecular science research laboratories, this Fourier transform ion cyclotron resonance mass spectrometer can make extremely high precision measurements of molecular masses.
The goal of the Collaboratory is to increase the efficiency of research and reduce the time required to implement new environmental remediation and preservation technologies. This new approach will decrease the costs of current projects and allow more complex tasks to be undertaken. The Collaboratory will leverage the intellectual and physical resources of the EMSL by making them more accessible to remote collaborators, as well as by making the resources of remote sites available to local researchers. In short, the Collaboratory will establish and support an electronic community of scientists researching and developing innovative environmental preservation and restoration technologies. To do this, the Collaboratory must meet a number of significant challenges.
Environmental molecular scientists utilize a wide variety of seemingly disparate experimental techniques to understand molecular systems. Instruments range from large nuclear magnetic resonance and mass spectroscopy machines to bench scale devices for studying molecular physics and chemistry.
One great challenge for the Collaboratory is to allow scientists and decision makers to share data at a higher, interpreted level, after an analysis to extract relevant information that is independent of the instrument and the experimental technique. A second challenge is to facilitate the movement of ideas from basic research across to remediation applications in the field. The gap between research and application is sometimes referred to as the "valley of death" since funding mechanisms are often not in place to pay for this technology transfer.
For example, there are storage tanks filled with complex mixtures of chemical and radioactive waste, having constituents ranging from solid, to peanut butter consistency, to liquid. The contents of these tanks must be excavated via remote handling and converted into stable solids, such as glass, for permanent, safe storage. The engineer in charge of this cleanup problem does not evaluate the data on the tank contents from the same point of view as the research chemist who has the model for the molecular processes taking place in the tank. Yet these individuals must communicate issues to one another to be assured that the outcome is indeed a stable, fault free containment of the waste products. Basic research provides the knowledge of the chemistry and materials physics. Engineering provides the process for remote handling and glassification. Decision makers determine the implementation path to put the remediation into practice. Facilitating the interactions of these groups will speed up the transfer of basic science out to the field where solutions are required.
Another challenge lies in electronic communications. Researchers will need to speak with each other and discuss their data using images, shared drawing tools, and shared programs. Interactive development of new experimental techniques and theories could occur on an unprecedented scale if scientists can effectively communicate without regard to location. Tele-mentoring (such as electronic teaching) will be required as researchers strive to understand each other's data and methods. The Collaboratory effort requires the development and integration of tools to support these general collaborative activities.

FIGURE 4: A cyberscientist utilizes the Collaboratory Software Environment prototype to interact with a collaborator.
The Building Blocks of a Collaboratory
Computers have become central to science, as they have become pervasive in all our lives. They are our entree to the hidden worlds of the very large and the very small. Computers run our experiments, collect our data, visualize our measurements, simulate the natural process, crank through the equations, digest our speculations, and spit out our answers. Computers are not universally accepted as desirable insertions in the scientific process. Computers can hide from the scientist's mind the understanding of nature by their manipulation of the raw data gleaned by those mechanical extensions of our senses which are the instruments of science. Computers are tools which confound and befuddle. A true Collaboratory will ultimately be an example of what has been termed "ubiquitous computing" where the computer is made effectively invisible to the user, disappearing from our awareness, rather than being the focus of attention.
With the long term goal of building a true Collaboratory, PNL is taking the first steps to lay a foundation for future development. We are working to meet the challenges through the development of the Collaboratory Software Environment (CSE) application - a single interface to a collection of general collaborative software tools, integrating their functionality in a unique way.
To be successful, the technical systems must be designed in recognition of the inherent social systems for collaboration and research. Providing for the needs of this new sociology of scientific research is essential if the Collaboratory approach to science is to be accepted and used. Psycho-social issues, often overlooked in discussions of collaborations, are of vital importance. Scientists' perception of being part of a group, even when geographically separated, will depend not only on communication technologies, but on the ability to develop working relationships and friendships, to have informal chats, and on all of the other ways people develop a common sense of purpose. Such dispersed groups may also have to be funded and managed in a new manner, and credit for the group's work must accrue to all of the group's members. Building a Collaboratory is, like any other construction project, a social as well as technological endeavor. The very concept of a Collaboratory demands as much innovation in its human aspects as it does in its engineering and scientific ones, particularly when the social controls and communication habits that have characterized our entire social evolution can no longer operate in the accustomed ways.
A collaborative session might be a phone conversation, a video-conference, an intense data analysis session, or an exchange of ideas with an electronic whiteboard. This variety of interaction defines the need for a selection of software tools from which a researcher can choose those appropriate for each situation. The prototype CSE consists of an extensible suite of software tools integrated in a way which allows the user to select those most appropriate to current needs. One important requirement for the CSE is that it be implemented as a cross platform environment, supporting computers from multiple vendors.
Six tools, providing different collaborative functions, are presently available in the prototype CSE:
* A Chat Box for the exchange of text messages among participants.
* Audio and video tele-conferencing.
* A shared electronic whiteboard with a common surface for all participants which can be written on via a mouse, joystick or tablet.
* A tele-viewer, which offers multiple collaborators the ability to simultaneously observe and interact with each other's windows or screens.
* A collaborative electronic scientific notebook which provides a shared, multimedia, permanent data storage mechanism for scientific information.
* Electronic information access via the World Wide Web for comprehensive access to documentation and data.
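The idea of an extensible suite from which a researcher opens only the tools a session needs can be sketched in a few lines of Python. This is a minimal illustration, not the CSE implementation: the class names (Tool, Session), the short tool names, and the methods open_tool and register are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Tool:
    """One collaborative tool. Names and fields are illustrative assumptions."""
    name: str
    purpose: str


def build_suite() -> Dict[str, Tool]:
    """Return the six tools listed above, keyed by a hypothetical short name."""
    tools = [
        Tool("chat", "exchange text messages among participants"),
        Tool("teleconference", "audio and video tele-conferencing"),
        Tool("whiteboard", "shared drawing surface for all participants"),
        Tool("televiewer", "observe and interact with each other's windows"),
        Tool("notebook", "shared, permanent multimedia record"),
        Tool("web", "World Wide Web access to documentation and data"),
    ]
    return {tool.name: tool for tool in tools}


@dataclass
class Session:
    """A collaborative session that starts with no tools open."""
    participants: List[str]
    suite: Dict[str, Tool] = field(default_factory=build_suite)
    active: List[str] = field(default_factory=list)

    def open_tool(self, name: str) -> Tool:
        """Open a tool from the suite; opening it twice has no extra effect."""
        if name not in self.suite:
            raise KeyError(f"no such tool in the suite: {name}")
        if name not in self.active:
            self.active.append(name)
        return self.suite[name]

    def register(self, tool: Tool) -> None:
        """Extend the suite with a new tool without touching the others."""
        self.suite[tool.name] = tool
```

A session created with Session(["researcher", "peer"]) would begin with nothing open, and each call to open_tool adds one tool in the order the conversation calls for it. Keeping the suite as a simple name-to-tool mapping is one way to let new tools be added later without changing the session logic.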
FIGURE 5: The Collaboratory Software Environment utilizes software tools such as a whiteboard, an electronic notebook, a chat box, an information window, a shared program window, and video conferencing to facilitate effective interactions among dispersed scientists.
Imagine a scenario for two scientists using the Collaboratory Software Environment. A researcher might start a collaborative interaction with a peer by using voice communication which then naturally expands to include video so that they might both see a piece of equipment (using the tele-conferencing tool). The need for them to discuss the operation of the instrument brings the need to look at the instrument's manual (online accessible through the information browsing tool). As the conversation turns to comparing results of theory and experiment, a shared window would be brought up so that there is a common view of the data as seen by a graphical analysis program (with the tele-viewer). The conversation would then evolve into requiring annotation of the data and spontaneous "what if" types of interactions by both colleagues (provided by the whiteboard tool). As the conversation comes to specific conclusions, the researchers store the result of their interaction permanently in the electronic notebook for future access.
This scenario is realizable today. In the near future we will see much more powerful collaboration options where these two scientists meet in a virtual place represented by characters moving through synthetic meeting rooms and laboratories. They will sit down at a virtual conference table and discuss the experimental results in a natural way, hand gestures and all, writing on whiteboards and pointing to analysis results suspended beside them. This artificial realization of the familiar will provide the sense of place they need to comfortably interact without being inhibited by the technology which enables the interaction.
FIGURE 6: Collaborators located on the same campus or at the other end of the continent share and interact through the electronic networks and the Collaboratory Software Environment.
Conclusion
It seems clear that Collaboratories have the potential to greatly benefit the scientific community by expanding the resources available to individual researchers, increasing the efficiency of our research system, and coupling basic to applied research efforts. The current system of scientific communication via completed papers, with occasional conferences and short visits, has been with us since the 17th century. Despite amazing advances in the technology to communicate rapidly and in great detail, there has been little change yet in the paradigm of scientific communication. Electronic mail and electronic publishing are forewarning a change to come. The Collaboratory concept is a qualitatively different way of using communication and information technologies. It has the potential to remove the walls around departments and organizations, and will lead to the creation of a meta-laboratory with capabilities that far exceed those available in any one laboratory alone.
Pacific Northwest Laboratory has embraced the idea of the Collaboratory as a powerful tool to connect researchers across the nation as they create solutions to our environmental problems. Extension of the Collaboratory "laboratory without walls" atmosphere to the environmental molecular sciences community will enhance our nation's understanding of the fundamental molecular aspects of environmental problems, and our ability to apply that knowledge to establish and maintain a safe and clean environment.
... The experiment had run its course as the NMR console faded from her screen, and those of her collaborators. Weeks of data analysis and intense discussion with colleagues and students would follow. Satisfied that another piece of the scientific puzzle was going to fall into place, she left her office in the Jadwin Physics building and trudged across campus towards home.
Sidebar 1: The Environmental Molecular Sciences Laboratory
The US Department of Energy's commitment to environmental cleanup at its sites presents significant scientific and technical challenges. These challenges are exemplified by the environmental problems at the Hanford site covering an area of southeastern Washington state more than half the size of Rhode Island, 1450 square kilometers (560 square miles). The Hanford site has approximately 1.4 cubic kilometers of hazardous and radioactive wastes, 388 square kilometers (150 square miles) of contaminated aquifer, 227,000 cubic meters (60 million gallons) of radioactive wastes (260 MCi) stored in underground storage tanks (of which more than one-third are believed to be leaking), 270 tons of spent fuel, 9 inactive reactors, and 7 major inactive reprocessing plants. The site is the equivalent of nearly 1400 Superfund sites. Pacific Northwest Laboratory (PNL), the DOE's multi-program national laboratory in Richland, Washington, is tasked with developing innovative, cost-effective technologies to address the environmental challenges at Hanford and across the nation, and to facilitate the application and commercialization of these technologies. The Environmental Molecular Sciences Laboratory (EMSL), a 200,000-square-foot $230 million research facility presently under construction at PNL, is a key to accelerating and ensuring the effectiveness of this cleanup effort.
The EMSL will be a national focus for the environmental and molecular science research communities. The special resources of the EMSL will enable scientists to apply advanced capabilities to research and technology development in areas such as contaminated soils and groundwater; waste analysis, characterization, processing, and storage; and effects on human health and ecology. The EMSL will be part of DOE's high-performance computing network linked to the other national laboratories, and to universities and industrial laboratories, allowing data and information generated in the EMSL to be shared electronically with the national and international scientific communities. As a national collaborative research and technology laboratory, the EMSL will attract hundreds of scientists from academia, industry, and other government laboratories across the United States and around the world. The nature of the EMSL project is inherently collaborative - scientists with expertise in chemistry, materials science, condensed matter physics, molecular biology, and environmental science must work together to address complex environmental problems.

FIGURE 7: The EMSL facility, presently under construction at PNL, will house state-of-the-art instrumentation for environmental molecular science research.
Sidebar 2: WWW and Mosaic
The World Wide Web (WWW), a system for maintaining distributed hypertext documents, originated at the European Center for Nuclear Research (CERN) in Geneva, Switzerland. Hypertext refers to text documents containing hidden pointers to other computer files which are accessed when designated words are selected in the text. This capability enables a reader to browse through a series of interconnected documents which may be located on different computers around the world. The Web is an example of systems being implemented to create the National Information Infrastructure (NII), also known as the Information Superhighway. Initially developed to keep track of researchers' information and to provide an easy method of sharing information among scientists, the Web has grown into one of the world's most widely used environments for information publishing, discovery and retrieval.
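The "hidden pointers" described above can be made concrete with a short Python sketch that reads a hypertext fragment and lists each pointer alongside the words a reader would select. It uses the standard library's html.parser module. The sample fragment, the file name notes.html, and the names LinkCollector and find_links are assumptions made for illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (target, visible words) pairs from anchor tags."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # target of the anchor currently being read
        self._text = []     # visible text seen inside that anchor

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def find_links(document):
    """Return every hyperlink in the document as (target, visible words)."""
    collector = LinkCollector()
    collector.feed(document)
    collector.close()
    return collector.links


# Hypothetical fragment: the second target, notes.html, is invented.
SAMPLE = (
    '<p>See the <a href="http://www.emsl.pnl.gov:2080/">EMSL home page</a> '
    'or the <a href="notes.html">lab notes</a>.</p>'
)
```

Calling find_links(SAMPLE) yields one pair per link: the pointer the reader never sees, and the designated words that trigger it.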
Mosaic, developed at the National Center for Supercomputing Applications (NCSA) at the University of Illinois, Urbana-Champaign, is a network information browser, or a World Wide Web client, that allows retrieval of documents across the Internet. As a distributed hypermedia browser designed for information discovery and retrieval, NCSA Mosaic provides a unified interface to diverse protocols, data formats, and information archives used on the Internet. Mosaic helps enable the exploration of a huge and rapidly expanding universe of information through an interface that is based on the idea of hypermedia, where electronic links - known as hyperlinks - are embedded in richly formatted documents that can include full-color images, video and sound. These documents are presented to the users like pages of an interactive, scrollable, on-line book. With the advent of NCSA Mosaic, traffic on the World Wide Web is doubling every six weeks. Mosaic is implemented on UNIX platforms, Macintosh computers, and Microsoft Windows based systems. All three client packages are freeware and can be obtained from NCSA's anonymous FTP site: ftp.ncsa.uiuc.edu.
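To give a feel for what "doubling every six weeks" implies, the following Python sketch computes the growth factor over an arbitrary period. It assumes the doubling time stays constant, which is an assumption of the example and not a claim made here about actual Web traffic. The function name growth_factor is hypothetical.

```python
def growth_factor(elapsed_weeks, doubling_weeks=6.0):
    """Factor by which traffic grows if it doubles every doubling_weeks."""
    return 2.0 ** (elapsed_weeks / doubling_weeks)
```

Under that assumption, growth_factor(6) is 2, growth_factor(12) is 4, and a full year of 52 weeks gives a factor of roughly 400.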
The Environmental Molecular Sciences Laboratory and the Pacific Northwest Laboratory have made available to users on the Internet information about the laboratory's research capabilities and programs. The initial entry point for information on a WWW server is known as a "home page." The EMSL public home page is at Universal Resource Locator (URL): http://www.emsl.pnl.gov:2080. Detailed information about the PNL Collaboratory effort, including documents and graphics, can be found on the EMSL Mosaic home page.

FIGURE 8: The Mosaic home page for the Collaboratory at the Environmental Molecular Sciences Laboratory.
Sidebar 3: The Sociology of Collaboration
As the power of information technologies has grown, it has brought humans to the threshold of a strange, new Collaboratory setting. In this synthetic place, distributed across space and time yet maintained through loops of electronic information flow, individuals will convene, converse and cooperate on some of the most challenging scientific problems of the 21st century. The Collaboratory concept is nothing less than the village square and campfire juxtaposed to the Information Age.
The concept is undoubtedly technically feasible. The question is whether it is socially sustainable. Is it possible to electronically create a distributed organization with a suitable `sense of place,' that permits and even enhances the successful cooperation of dispersed individuals toward common goals? Anthropologists have long maintained that it was the need of early humans for collaboration in hunting and foraging activities that drove the development of communications. Now that social equation is to be written in reverse: How can Collaboratory communications be established that permit human cooperation to thrive, even though the evolutionary social mechanisms that have depended on proximity are absent?
The last decade's experience with communication via electronic systems revealed the ubiquity and usefulness of human social controls that go mostly unnoticed in face-to-face contacts. When a technical medium of discourse removes these controls, their role becomes apparent. In this respect, the first language is surely the language of gesture; and every social encounter takes place in the context of this `silent language' of body motions and spatial positions. Body orientation and movement, the interpersonal speaking distance, and making and breaking of eye contact all send silent messages that are just as meaningful as the spoken word. These are the means that maintain social controls on spoken exchanges, and yet they are absent in the electronic medium.
In terms of the written word, email is the great communications leveler. A person who would never think of calling a complete stranger to ask for assistance with a reference or a technical problem will have no compunction in contacting that stranger via email. Email also often omits the obligatory social salutations and closing rituals that mark personal correspondence or even telecommunications. The silent, social controls of personal discourse are uniformly absent in email. `Emoticons' or `smileys', a set of symbol characters to be read sideways as a facial pictogram, have only recently appeared on the scene, as a way to convey feelings in an unfeeling medium, {:-).
Even when the image and sound of a person is restored through audio-video communications, technical limitations can make the exchange much less than satisfying. Improper placement of a pickup camera can make it appear disconcertingly that the speaker is always looking away from the listener. Limited bandwidth in picture transmission can produce `freeze frames' that catch the individual in the middle of a sneeze, yawn, or eye blink. The small size and placement of monitors can promote a `talking heads' impression that socially diminishes the messenger and the message.

FIGURE 9: An understanding of psycho-social issues, such as a sense of place and attention to ritual, is crucial to the success of a scientific Collaboratory.
Acknowledgements
This work was supported by the Laboratory Directed Research and Development program at Pacific Northwest Laboratory (PNL). PNL is a multiprogram national laboratory operated by Battelle Memorial Institute for the U.S. Department of Energy under contract DE-AC06-76RLO 1830. Many individuals are contributing to the Collaboratory effort at PNL, including Jim Myers, Gina Najera, John Price, Bruce Rex, Ian Roberts, Anne Schur and Jim Wise. Further information is available at URL http://www.emsl.pnl.gov:2080/, or via email to rt_kouzes@pnl.gov.
About the Author
Dr. Richard T. Kouzes is a Staff Scientist in the Computer and Information Sciences program of the Environmental Molecular Sciences Laboratory and the Collaboratory and New Initiatives team leader. He is a cyberscientist with a research program in computer supported cooperative work, advanced data acquisition system development, neural network applications and precision atomic mass measurements using FTICR. He is a leader in the field of computerized data acquisition and was a founder and past chair of the IEEE Computer Applications in Nuclear and Plasma Sciences Committee. He is the author of over 60 refereed papers.
Before coming to PNL, Dr. Kouzes was a Senior Research Physicist and Lecturer at Princeton University, where for fifteen years he was a leading researcher in solar neutrino and nuclear structure experimentation. He was a collaborator in two major international solar neutrino experiments, and actively pursued research in nuclear techniques for precision atomic mass measurements. He developed several data acquisition systems for nuclear physics applications.
Dr. Kouzes earned his Ph.D. in physics from Princeton University in 1974.