M/s submitted to Chemistry in Britain.

Please address all correspondence to:

Dr Mark J Winter
Department of Chemistry
The University of Sheffield
Sheffield S3 7HF, UK.

M.Winter@Sheffield.ac.uk

Modification Date: Wednesday, March 29, 1995


Chemistry and the WWW

Henry S Rzepa, Benjamin J Whitaker, and Mark J Winter

Department of Chemistry, Imperial College, London, SW7 2AY, UK
School of Chemistry, University of Leeds, Leeds LS2 9JT, UK
Department of Chemistry, The University of Sheffield, Sheffield S3 7HF, UK


Box 4: Multipurpose Internet Mail Extensions for Chemistry

Multipurpose Internet Mail Extensions, or MIME for short, is an Internet standard through which non-textual, or more specifically non-ASCII encoded, messages can be sent electronically across the world's computer networks. The MIME standard was introduced by Nathaniel Borenstein and Ned Freed in a series of Internet documents published in the early 1990s [5]. Until the MIME mechanism was accepted, data such as images, video, audio files and compiled application software first had to be run through a program such as "uuencode" or "binhex" to generate an ASCII-encoded file that could be sent through the various mail relays to its destination. This was clumsy and time-consuming, since the data had to be encoded before transmission and decoded on receipt, generally by hand.
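The encode-and-decode round trip described above can be sketched in a few lines of Python. This is only an illustration: it uses base64, an encoding commonly used within MIME, as a stand-in for "uuencode" or "binhex", and the sample bytes are invented.

```python
import base64

# Invented binary data containing bytes that are not valid 7-bit ASCII.
raw = bytes([0, 255, 128, 10, 13, 200])

# Sender's side: turn the binary data into ASCII-only text.
encoded = base64.b64encode(raw)
is_ascii_only = all(byte < 128 for byte in encoded)

# Recipient's side: recover the original bytes.
decoded = base64.b64decode(encoded)

print(encoded.decode("ascii"), is_ascii_only, decoded == raw)
```

Before MIME, both steps were separate manual operations; with MIME, the mail program carries them out on the user's behalf.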

The solution put forward by Borenstein and Freed overcame these inconveniences without introducing any serious incompatibilities with existing electronic mail standards. At the heart of the mechanism, a header field specifies the primary content type and sub-type of the data in the body of the message, and also indicates how the data should be represented.

As originally formulated, the primary data types were "text", which could be used to represent textual information in a number of character sets, including those other than US-ASCII; "image", to transmit still picture data; "video", for moving pictures; "audio", for audio or voice data; and "application", for application or binary data. Two other primary types were also specified: "multipart", so that several data types could be sent at one time, and "message", for encapsulating one mail message within another. The authors formulated their proposals in such a way that the mechanism could easily be extended to incorporate other primary content types.
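As a rough illustration of how the content type and sub-type label each part of a message, the sketch below builds a two-part message with Python's standard "email" package. The addresses, subject, file name and image bytes are placeholders invented for the example, not taken from any real message.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "sender@example.org"        # placeholder address
msg["To"] = "recipient@example.org"       # placeholder address
msg["Subject"] = "Picture attached"

# First part: ordinary text, labelled text/plain.
msg.set_content("The picture is attached.")

# Second part: binary data labelled image/gif.
# The bytes are a stand-in, not a real picture.
fake_gif = b"GIF89a" + bytes(16)
msg.add_attachment(fake_gif, maintype="image", subtype="gif",
                   filename="picture.gif")

# Adding an attachment turns the message into a "multipart" container.
content_types = [part.get_content_type() for part in msg.walk()]
image_part = list(msg.iter_attachments())[0]

print(content_types)
print(image_part["Content-Transfer-Encoding"])
```

Printing the whole message with print(msg.as_string()) shows the Content-Type header field at the top of each part, together with the ASCII encoding chosen for the binary data.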

Given the primary content-type and sub-type of the message, the receiving mail program can then work out how to display the data. For example, a message specifying a content-type/subtype of image/gif clearly refers to a still picture encoded using the GIF format. Provided that the receiving mail program knows what to do with a GIF image file, it can launch an appropriate application program to view or print the picture. These programs are known as "helper applications". A number of popular mail programs can be configured to implement MIME. Among these are "eudora", "pine", and "elm". Instructions on how to configure these mail programs can be found on the Internet.[6]
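The lookup that a receiving program performs can be sketched as a simple table from content types to helper commands. The table below is hypothetical: the command names are invented, and real mail programs keep this information in their own configuration files, whose format is not shown here.

```python
# Hypothetical helper table; "%s" stands for the name of the saved file.
HELPERS = {
    "image/gif": "gifviewer %s",
    "image/*": "imageviewer %s",     # fallback for any other image sub-type
    "audio/basic": "audioplayer %s",
}

def find_helper(content_type, helpers=HELPERS):
    """Return the helper command for a content type, or None if unknown."""
    content_type = content_type.strip().lower()
    if content_type in helpers:
        return helpers[content_type]
    # No exact match: fall back on a wildcard entry for the primary type.
    primary = content_type.split("/", 1)[0]
    return helpers.get(primary + "/*")

print(find_helper("image/gif"))
print(find_helper("image/jpeg"))
print(find_helper("video/mpeg"))
```

A type with no entry returns None, which corresponds to the case where the mail program does not know what to do with the data and can only offer to save it.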

The use of MIME as a means of attaching audio and video data to electronic mail messages does not seem to have gained much popularity until the development of the World-Wide Web. The Web developers, however, soon realised that image, audio, and video data could be embedded into documents using the MIME mechanism. Our modest contribution to the field came from the realisation that chemical information could also be transmitted very efficiently over the Internet using the same mechanism. We first described these ideas in a Chemical Communication,[7] and have since formulated a draft Internet document describing our proposed standard for the use of MIME to transmit chemical data electronically.[7] The essence and the utility of the idea come from the recognition that chemical data is intrinsically three-dimensional and that the same information can be interpreted differently in different contexts; that is, chemical data is semantically rich. Thus, for example, by transmitting a list of atomic coordinates and a connection table it is possible to "reconstruct" a molecule in a number of different ways (Fig. 1). The context is defined by the MIME sub-type. The molecular information can be rendered in any number of ways, for example as an animated pseudo-3D model or as the input to a quantum chemistry calculation, depending on the context and the preferences of the recipient of the data. Eventually, we believe, it will be possible to transmit chemical information over the Internet in such a way that a robotic agent at the receiving end will be able to synthesise the material.
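The idea that one set of coordinates and one connection table can be interpreted in several ways, selected by the sub-type, can be sketched as follows. The two sub-type names are invented placeholders for this example and are not taken from the proposed standard; the water geometry is approximate.

```python
import math

# Water: element symbol and approximate Cartesian coordinates in angstroms.
ATOMS = [("O", 0.000, 0.000, 0.000),
         ("H", 0.957, 0.000, 0.000),
         ("H", -0.240, 0.927, 0.000)]
# Connection table: pairs of atom indices that are bonded.
BONDS = [(0, 1), (0, 2)]

def as_coordinate_list(atoms, bonds):
    """Plain coordinate listing, e.g. as input for a display program."""
    lines = [str(len(atoms))]
    lines += [f"{el} {x:.3f} {y:.3f} {z:.3f}" for el, x, y, z in atoms]
    return "\n".join(lines)

def as_bond_lengths(atoms, bonds):
    """Bond lengths derived from the same coordinates and connection table."""
    lines = []
    for i, j in bonds:
        length = math.dist(atoms[i][1:], atoms[j][1:])
        lines.append(f"{atoms[i][0]}{i + 1}-{atoms[j][0]}{j + 1} {length:.3f}")
    return "\n".join(lines)

# Hypothetical sub-types: the label chooses the interpretation.
INTERPRETERS = {
    "chemical/x-example-coordinates": as_coordinate_list,
    "chemical/x-example-bond-lengths": as_bond_lengths,
}

def interpret(content_type, atoms, bonds):
    return INTERPRETERS[content_type](atoms, bonds)

print(interpret("chemical/x-example-coordinates", ATOMS, BONDS))
print(interpret("chemical/x-example-bond-lengths", ATOMS, BONDS))
```

The data passed to both functions are identical; only the label changes what the recipient's software does with them.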

As a method for working chemists to communicate effectively over the Internet via electronic mail, chemical MIME has obvious advantages, and we believe that the use of chemical MIME types will rapidly gain wide acceptance. A number of data repositories, such as the National Institutes of Health, have already adopted it for searching and browsing the Brookhaven Protein Databank. New Web applications are appearing almost daily.


References