(2) The Evergreen State College, Olympia, WA 98502, USA
This paper has been submitted to the IEEE Fifth Workshops on Enabling Technologies: Infrastructure for Collaborative Enterprises to be held at Stanford University, California on 19-21 June 1996.
Pacific Northwest National Laboratory is a multiprogram national laboratory operated by Battelle Memorial Institute for the U.S. Department of Energy under Contract DE-AC06-76RLO 1830.
The laboratory notebook is a vital tool in scientific research. It is the central repository of information about the reasoning and preparation behind experiments, about the analyses done to obtain results, and about plans for future research. The notebook captures the scientific process that gives meaning to a scientist's observations. Sharing a notebook can help collaborating researchers to build a common understanding of their work. Unfortunately, at modern research facilities, where collaborations may span the globe, providing remote researchers with access to a laboratory notebook becomes a difficult task.
Traditionally, the history of a series of experiments and the knowledge developed by individual researchers have been captured in individual paper notebooks for their own use, or for use by a few local colleagues involved in the experiment. The notebook aids creation of new knowledge by providing a central repository documenting the motivation for experiments, the organization of resulting data and its analyses, as well as insights, thoughts on future directions, and other vital information. The laboratory notebook is the primary record from which a detailed understanding of the direction, purpose, and significance of a research program can be derived.
In modern DOE research facilities, such as the Environmental Molecular Sciences Laboratory (EMSL) at the Pacific Northwest National Laboratory, the questions researchers seek to answer are complex and interdisciplinary. Thus their research is inherently collaborative. The research teams created to provide the necessary expertise to address these complex questions may be distributed between laboratory buildings, across the country, or across continents. Creating, maintaining, and accessing the group's information repository using today's tools is a major challenge for such teams. Yet such a repository is vital for close, efficient, and successful collaboration. An electronic laboratory notebook (ELN) that combines the flexibility and familiarity of the paper notebook with the organizational power of computer databases and the distributive capabilities of computer networks would provide a general solution to the challenges of recording and sharing collaborative groups' knowledge. In addition to distributed access, which benefits the group, an ELN can provide automated data entry, searching, and other information processing capabilities that are impossible with a paper notebook and that add value for the individual researcher and research team alike.
ELNs developed to date have not been sufficiently powerful or user friendly for widespread use. The paper by Sonia Sachs [1], "Electronic Notebooks for Distributed Collaboratories: envisioned research and development," describes why development of ELNs is so difficult. She details the wide range of annotation types stored in conventional notebooks, from text and equations to plots and pictures, while noting that an ELN should work across the myriad of common computer systems. Early experimentation with commercial electronic laboratory notebooks within the EMSL has confirmed the importance of supporting many types of notebook entries. The emergence of the WWW as an effective way to provide access to multimedia information in a uniform and standardized manner has greatly simplified this aspect of developing an ELN. WWW browsers provide an extensible canvas upon which text, pictures, audio, and other objects can be displayed and activated. However, the WWW and the hypertext transfer protocol (HTTP) it relies on do not provide much interactivity. The limitations range from requiring a reload of an entire page whenever any information is updated to the inability of the server to alert a browser when new information is available. Fortunately, many extensions, including frames, magic cookies, new HTML tags [2], and especially the Java programming language [3], are making it possible to circumvent some of the restrictions of the original protocols.
We believe it is now possible to develop an interactive ELN using the WWW, and to make such a system flexible enough to work in a variety of research situations. In this paper, we describe our progress in creating a prototype ELN that provides much of the annotation type richness of a paper notebook in a secure, distributed form. Beyond just providing remote interactive access to notebook information, the prototype also expands upon the capabilities of the paper version by including a query facility and by allowing automated entries to be made directly by data acquisition software. The prototype is continuing to evolve as we receive feedback from potential end users and as WWW, Internet, and object oriented technologies advance.
A. Automated Entries and Information Display
The ELN prototype uses the WWW for its main user interface. To enter the notebook, users simply go to the appropriate uniform resource locator (URL). Information in the notebook is divided into two frames: a main window, and a simple control bar below it. This interface is shown in Fig. 1. To see the list of available notebooks, the user presses the 'Notebooks' button. Information is nominally stored in the notebook in a hierarchical collection of notebooks holding experiment folders, although folders from across several notebooks could be collected as a result of a query. The user navigates the information through standard hypertext links and 'Go back' buttons.
Figure 1. The ELN prototype's home page.
The experiment folder level page is shown in Fig. 2. Folders can be automatically generated by data acquisition programs. This software provides the standard datafile, including data and information about the data (instrument settings, operator comments, etc.), to the notebook. The notebook parses the datafile, producing a table of the metadata and a live Java X,Y graph of the data that allows the user to zoom in, read coordinates, etc. The full data file is available via an HTML link. When a user clicks the link, data is downloaded to their browser tagged with a custom multipurpose internet mail extension (MIME) type [4]. This allows the user to set up an analysis application as a viewer for this MIME type and have the data loaded directly into the application from the browser.
Figure 2. A single experiment with annotations. Experiment information is automatically sent to the notebook from the data acquisition program. Users can retrieve the full datafile by clicking on the image, or add text, image, or file annotations by clicking the 'Add' button.
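The parsing step described above, in which the notebook turns a datafile into a metadata table plus plottable data, can be illustrated with a minimal Python sketch. The paper does not specify the datafile layout, so the format assumed here ('# key: value' header lines followed by whitespace-separated x y pairs), the function names, and the sample values are all hypothetical.

```python
from html import escape


def parse_datafile(text):
    """Split a datafile into (metadata dict, list of (x, y) points).

    Assumed layout (hypothetical, not the EMSL format): '# key: value'
    header lines, then one whitespace-separated 'x y' pair per line.
    """
    metadata, points = {}, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            key, _, value = line[1:].partition(":")
            metadata[key.strip()] = value.strip()
        else:
            x, y = line.split()
            points.append((float(x), float(y)))
    return metadata, points


def metadata_table(metadata):
    """Render the metadata as a simple HTML table, escaping special characters."""
    rows = "".join(
        f"<tr><th>{escape(k)}</th><td>{escape(v)}</td></tr>"
        for k, v in metadata.items()
    )
    return f"<table>{rows}</table>"


# Invented sample content for illustration only.
SAMPLE = """\
# instrument: ion trap mass spectrometer
# operator: J. Smith
# comment: first acetone run
50.0 0.12
51.0 0.87
"""

metadata, points = parse_datafile(SAMPLE)
print(metadata_table(metadata))
print(len(points), "data points")
```

In the prototype the points would feed the Java graph rather than being counted; the sketch only shows how metadata and data might be separated.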
B. Interactive Annotation
Below the experiment information provided directly by the data acquisition application is a series of user annotations. At present, new annotations can only be created at the end of the list. This will be changed in later versions to allow new comments to be inserted near specific existing annotations, emulating the process of writing in the margins of a paper notebook. To add an annotation, the user clicks the 'Add' button, which brings up the annotation page as shown in Fig. 3. Three generic types of annotation are provided: text, image, and file. Together, they provide much of the richness of paper, pen, and tape.

Figure 3. The ELN annotation page. Users can add a text or HTML annotation, capture an image from the screen, or upload a file to be linked to the page.
Text entries may be plain text or HTML. Using the latest HTML tags, researchers can create colored, bold, and italicized text, subscripts and superscripts, tables, etc. [2] Links to other information can also be entered using HTML. In the prototype ELN, text entry is supported by a standard text entry field in an HTML form. There is no specific support for generating or editing HTML text within the field itself. For complex HTML input, users can currently cut and paste from any third party HTML editor into the notebook text input field. However, Netscape is already providing integrated HTML editing capabilities in its Navigator Gold product, and further integration and simplification of HTML editing within WWW browsers via the Java programming language seems likely. We expect to integrate one or more of these mechanisms with the ELN prototype in the future, making the input of formatted text and tables into the notebook easier.
Images can be captured directly from the user's computer display. The ELN includes a helper application that is launched by clicking the 'Capture' button. The capture application allows the user to select an arbitrary screen rectangle which is captured, converted to a local GIF file, and uploaded to the notebook via the browser. The communication mechanisms employed to accomplish this are described in the next section. Direct screen capture allows the user to input a wide range of information into the notebook, limited only by the available computer applications, in a manner similar to taping loose paper into a notebook. Graphs, visualizations, and schematics designed in other applications can be recorded via this mechanism. Information that would have been entered freehand into a paper notebook - formulas, tables, sketches, drawings - can be created in word processors, spreadsheets, and paint programs, and captured for inclusion in the ELN.
Support for including arbitrary files in the notebook extends the capabilities of the ELN beyond those of paper, allowing native files from analysis programs, presentation graphics software, the standard business applications noted above, etc. to be linked into the ELN and retrieved later for modification. This also allows inclusion of new annotation types, such as audio and video into the notebook. These files can be played back with a single click, again using the WWW browser's MIME file handling mechanism to invoke the proper media player.
Users can also combine text, image, and file to create a crude 'annotation object'. Creating an elegant way to provide both an image and the original information in a form that can still be easily manipulated is one goal of object oriented programming and standards such as Object Linking and Embedding (OLE) [5] and OpenDoc [6]. It is also part of the promise of the Java programming language. As these technologies mature, we plan to incorporate them into future versions of the ELN.
C. Experiment Query Facility
In addition to viewing experiments arranged in predefined notebooks, users can query on specific aspects of the experiment - sample compound, the researcher's name, experiment date - and create a dynamically generated notebook containing only the selected experiments. Thus a notebook with only "John's experiments on acetone from last month" can be created and viewed without having to manually search through experiments done on other molecules during that time.
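A query of this kind amounts to filtering experiment records on whichever criteria the user supplies. The following Python sketch shows one way this could look; the record fields, the function name, the sample experiments, and the choice of March 1996 as "last month" are assumptions made for illustration, not details of the prototype.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Experiment:
    # Hypothetical record layout; the prototype's actual fields are not given.
    notebook: str
    researcher: str
    compound: str
    run_date: date


def query(experiments, researcher=None, compound=None, start=None, end=None):
    """Return the experiments that match every criterion that was supplied."""
    results = []
    for exp in experiments:
        if researcher is not None and exp.researcher != researcher:
            continue
        if compound is not None and exp.compound != compound:
            continue
        if start is not None and exp.run_date < start:
            continue
        if end is not None and exp.run_date > end:
            continue
        results.append(exp)
    return results


# Invented sample data spanning two notebooks.
EXPERIMENTS = [
    Experiment("ion-trap", "John", "acetone", date(1996, 3, 4)),
    Experiment("ion-trap", "John", "benzene", date(1996, 3, 9)),
    Experiment("nmr", "Mary", "acetone", date(1996, 3, 12)),
    Experiment("ion-trap", "John", "acetone", date(1996, 1, 20)),
]

# "John's experiments on acetone from last month", taking March 1996 as the month.
selected = query(EXPERIMENTS, researcher="John", compound="acetone",
                 start=date(1996, 3, 1), end=date(1996, 3, 31))
for exp in selected:
    print(exp.notebook, exp.run_date.isoformat())
```

The returned list plays the role of the dynamically generated notebook: it can draw folders from several notebooks at once, since the filter ignores where each record is stored.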
The ELN prototype relies on the WWW to provide distributed, secure, collaborative access to the notebook contents across the Internet, independent of the user's computer platform. The WWW provides a rich, dynamic, and extensible mechanism for distributing formatted information. Security, at the level of password protection of information, is also provided by the WWW via access control lists [7]. No other existing system provides such a collection of features with such promise for future enhancement.
The communications mechanism used by the ELN and its relationship to other parts of the EMSL infrastructure are shown in Fig. 4. The ELN pages displayed in the user's browser are dynamically generated using the WWW's common gateway interface (CGI) mechanism [8]. CGI allows the WWW to pass parameters that have been coded into hypertext links, or parameters entered by users in forms, to external processes. These processes can then generate HTML (or other) documents to return through the WWW to the user. In the ELN prototype, CGI scripts are used to format information from the database for the user, to implement the query facility, and to accept user input into the notebook.
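As an illustration of the CGI pattern just described, the Python sketch below decodes parameters from a query string and returns a header block followed by an HTML document. It is not the prototype's script: the parameter names `notebook` and `folder` and the page layout are hypothetical.

```python
from html import escape
from urllib.parse import parse_qs


def handle_request(query_string):
    """Build a CGI-style response from the parameters in a query string."""
    params = parse_qs(query_string)
    # parse_qs returns a list of values per key; take the first or a default.
    notebook = params.get("notebook", ["(none)"])[0]
    folder = params.get("folder", ["(none)"])[0]
    body = (
        "<html><body>"
        f"<h1>Notebook: {escape(notebook)}</h1>"
        f"<p>Experiment folder: {escape(folder)}</p>"
        "</body></html>"
    )
    # A CGI response is header lines, a blank line, then the document.
    return "Content-Type: text/html\n\n" + body


# Under a web server the string would come from the QUERY_STRING
# environment variable, e.g. os.environ.get("QUERY_STRING", "").
print(handle_request("notebook=ion-trap&folder=run+12"))
```

Escaping the parameter values before placing them in the page keeps user-supplied text from being interpreted as markup.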
At present, the ELN database is a collection of flat files arranged in a hierarchy of notebook and folder directories. An object oriented database for the ELN is under development. The eventual plan for the ELN is to allow it to query the EMSL's database/archive system directly for information about experiments rather than creating a local copy. Until then, the data acquisition software chosen to demonstrate automated entries has been modified to ftp a copy of saved data to the notebook area where it can be read by the CGI scripts.
Figure 4. Communications pathways in the ELN prototype.
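The flat-file arrangement described above, with notebook directories holding experiment folder directories, can be indexed with a simple directory walk. The Python sketch below builds a small temporary tree and lists it; the directory names and the `data.txt` filename are invented for the demonstration and do not reflect the prototype's actual layout.

```python
import tempfile
from pathlib import Path


def list_notebooks(root):
    """Map each notebook directory name to its sorted experiment folder names."""
    index = {}
    for notebook in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        index[notebook.name] = sorted(
            f.name for f in notebook.iterdir() if f.is_dir()
        )
    return index


def build_demo_tree(root):
    """Create a small hypothetical notebook tree for demonstration."""
    for notebook, folder in [("ion-trap", "exp001"), ("ion-trap", "exp002"),
                             ("nmr", "exp001")]:
        target = Path(root) / notebook / folder
        target.mkdir(parents=True)
        (target / "data.txt").write_text("50.0 0.12\n")


with tempfile.TemporaryDirectory() as root:
    build_demo_tree(root)
    index = list_notebooks(root)

for notebook, folders in index.items():
    print(notebook, folders)
```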
As mentioned above, interactive annotations rely on HTML forms and CGI script processing. This directly handles both text/HTML input, and, with the new file upload HTML tag implemented by Netscape, arbitrary files. Image capture is a bit more complex. Our approach incorporates the image capture routine into a WWW browser "helper application". This allows the capture routine to be launched in response to a button press on the 'Add Annotation' form. The button, with type submit, invokes an ELN CGI script which returns a custom MIME typed file. Once launched, the capture routine presents the user with a crosshair cursor to click and drag over the rectangle to be captured. The image, captured as a GIF file, is then uploaded through the WWW along with any text and/or file the user submits. (The text box, file upload field, and the add annotation button are all part of a second form on the same WWW page as the single button image capture form.) An alternate method for uploading the image file is to use a browser communication mechanism, such as Netscape's platform dependent client interfaces (NCAPI), allowing the capture helper application to programmatically generate a request to upload the new image file and communicate it to the browser. In either case, when the WWW server receives the request from the browser, it is handled using a CGI script, exactly as if the user had entered the filename in a form displayed in the browser. These mechanisms, while complex behind the scenes, appear to the user as a simple WWW based image capture facility that can be used to import the visual output of any computer application.
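The helper application launch can be pictured as two halves: the CGI script returns a document tagged with a custom MIME type, and the browser consults its table of configured helpers to decide what to start. The Python sketch below models both halves in miniature. The MIME type `application/x-eln-capture`, the body content, the helper name, and the URL are all hypothetical; the paper does not give the actual values.

```python
# Hypothetical MIME type; the prototype's actual type is not given in the text.
CAPTURE_MIME_TYPE = "application/x-eln-capture"

# Stand-in for the browser's table of configured helper applications.
HELPERS = {CAPTURE_MIME_TYPE: "capture-helper"}


def capture_trigger_response(upload_url):
    """Server side: return a CGI response tagged with the custom MIME type."""
    headers = f"Content-Type: {CAPTURE_MIME_TYPE}\n\n"
    body = f"upload-url: {upload_url}\n"  # illustrative payload only
    return headers + body


def choose_helper(response):
    """Browser side: pick the helper registered for the response's MIME type."""
    header_block, _, body = response.partition("\n\n")
    for line in header_block.splitlines():
        name, _, value = line.partition(":")
        if name.strip().lower() == "content-type":
            return HELPERS.get(value.strip()), body
    return None, body


response = capture_trigger_response("http://example.org/cgi-bin/add")
helper, payload = choose_helper(response)
print("launch:", helper)
print("payload:", payload.strip())
```

A response with an ordinary type such as text/html finds no entry in the helper table, so the browser would render it itself rather than launching the capture routine.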
The EMSL prototype ELN is the first step toward a secure, interactive, distributed, multimedia, WWW based laboratory notebook coupled to the primary EMSL database archive system that can support the needs of EMSL researchers and their external colleagues to develop and share a common repository of scientific knowledge. The prototype demonstrates the concept and promise of an ELN, and provides a platform for further investigation and development of ELN technology. Researchers working collaboratively on an EMSL ion trap mass spectrometer and other instruments have agreed to test the prototype ELN and provide feedback on its usability and suitability.
There are certainly limitations in the present system that need to be removed in future versions. The inclusion of more Java applets designed to allow annotations, such as freehand drawings, to be created within the browser, and integration with the EMSL database/archive system should both improve the ELN, allowing more natural interaction with the notebook and more powerful organization and querying capabilities respectively. These directions are the subject of a recently submitted proposal.
Despite the room for improvement, the current ELN prototype is surprisingly capable. It provides remote users with up-to-date information on the current state of a research project with little effort on the part of the local researcher. Data is automatically entered into the notebook by the data acquisition software and is immediately available, along with any annotations, to remote colleagues, as well as to the local researcher back in his/her office. No special steps are required to 'send' information to colleagues; everyone with permission immediately has the same information. New annotation types, including audio and video, as well as analysis and presentation files, can be included in the ELN, while querying reduces the time spent searching for the desired data in the notebook. Both of these capabilities enhance the usability of the ELN prototype relative to a paper notebook for both the individual scientist and the research team. The power of the EMSL ELN prototype, coupled with the continuing rapid advances in WWW, Internet, and object oriented technologies, strongly suggests that electronic laboratory notebooks will soon be ready to take on an important role in the work of distributed research teams.
[1] Sonia R. Sachs, "Electronic Notebooks For Distributed Collaboratories: envisioned research and development," Lawrence Berkeley National Laboratory Internal Report, November 1995. http://www-itg.lbl.gov/~ssachs/notebook/notebook.vision.html
[2] "HyperText Markup Language (HTML)," World Wide Web Consortium (W3C), http://www.w3.org/pub/WWW/MarkUp/
[3] "Java: Programming for the Internet," Sun Microsystems Incorporated, http://java.sun.com
[4] N. Borenstein and N. Freed, "MIME (Multipurpose Internet Mail Extensions) Part One: Mechanisms for Specifying and Describing the Format of Internet Message Bodies," 1993. http://ds.internic.net/rfc/rfc1521.txt
[5] "OLE Programmer's Reference 2nd Edition," (2 vols) - Draft, Microsoft Corporation, ftp://ftp.microsoft.com/developr/drg/OLE-info/OLE-docs/
[6] "OpenDoc," Component Integration Laboratories, Apple Computer Inc., and IBM Corporation. http://www.cilabs.org/opendoc.html, http://opendoc.apple.com, http://www.software.ibm.com/clubopendoc/
[7] "W3C Security Resources," World Wide Web Consortium (W3C), http://www.w3.org/pub/WWW/Security/
[8] "CGI: Common Gateway Interface," World Wide Web Consortium (W3C), http://www.w3.org/pub/WWW/CGI/
This work was supported by the U. S. Department of Energy through the Distributed Collaboratory Experiment Environments (DCEE) program sponsored by the Mathematical, Information and Computational Sciences Division of the Office of Energy Research, and through the Laboratory Directed Research and Development program at Pacific Northwest National Laboratory (PNNL). PNNL is a multiprogram national laboratory operated by Battelle Memorial Institute for the U.S. Department of Energy under Contract DE-AC06-76RLO 1830.