Integrating temporal media and open hypermedia on the World Wide Web
Niels Olof Bouvin, René Schade
 Department of Computer Science, University of Aarhus, Aabogade 34A, DK8200 Aarhus N, Denmark 
Tele Danmark Internet, Olof Palmes Allé 36, DK8200 Aarhus N, Denmark
The World Wide Web has since its beginning provided linking to and from text documents encoded in HTML. The Web has evolved, and most Web browsers now support a rich set of media types, either by default or through specialised content handlers known as plug-ins. The limitations of the Web linking model are well known, and they extend into the realm of the other media types currently supported by Web browsers. This paper introduces the Mimicry system, which allows authors and readers to link to and from temporal media (video and audio) on the Web. The system is integrated with the Arakne Environment, an open hypermedia integration aimed at Web augmentation. The links created are stored externally, allowing for links to and from resources not owned by the link author. Based on these experiences, a critique is raised of the limited APIs supported by plug-ins.
1999 Published by Elsevier Science B.V. All rights reserved.
Keywords: Temporal media; Open hypermedia; Plug-ins; Web augmentation
1. Introduction
The World Wide Web has since its beginning steadily embraced more and more types of media. Today the average Web user will be exposed to pictures, video clips, sound recordings, and music, and will interact with programs or 3D worlds residing on Web pages. These types of media are handled either by the Web browser itself or by specialised programs, ‘viewers’ or ‘plug-ins’. Most media types are, however, supported or handled only in the sense that it is possible to link to the entire media clip or to include it on a Web page, but not to link from the media clip itself or to a segment of the media clip. Pictures are the exception, using image maps to
provide starting points for navigation. However, all media types (HTML and otherwise) share the limitations of in-line unidirectional links (with the possible exception of image maps, which may have links defined externally); links cannot originate from documents not owned by the hopeful link creator, and the destinations of a link into an HTML document are limited to the named regions in the target document. These deficiencies are being addressed by integrating open hypermedia systems and the Web, allowing link structures to be stored externally to documents. This approach also allows for links to and from media types less amenable to modification than HTML, provided that suitable plug-ins or viewers are used.
This paper describes an open hypermedia integration to provide linking facilities to and from
temporal media (such as video and audio clips). A search for an appropriate plug-in to provide the necessary functionality left the authors empty-handed. This resulted in the implementation of the Mimicry player, which substitutes for a plug-in. Through the Mimicry controller, the system interacts with the Arakne Environment [4], allowing users to create multiheaded bi-directional links to and from temporal media clips, embedded or otherwise, as well as to and from HTML documents.
The results achieved by the Mimicry system (and the relative ease of implementing the ideas behind it) raise the question of why plug-in developers do not yet provide the functionality to support such a system. Doing so would substantially ease the work of Web page designers, as media clips or parts thereof could be reused, and it would support technologies that are new on the Web, such as linking to and from temporal media.
The paper begins by introducing related work in the field of open hypermedia and on the Web. The merits of emerging standards such as SMIL, HTML+TIME, and XLink/XPointer are discussed. The Arakne Environment, wherein the Mimicry system runs, is introduced and described. The Mimicry system (player and controller) is described in detail and an example of Mimicry usage is given. Based on the experiences with the Mimicry system, the current state of plug-ins is discussed in the context of hypermedia systems. Finally, directions for future work are discussed and a conclusion is reached.
2. Related work
This section introduces the notion of externally stored link structures, open hypermedia systems, Web augmentation, and the work done on integrating temporal media and hypermedia systems.
2.1. Separating document and structure
The linking model found in the World Wide Web is based on in-line unidirectional links. While simple and scalable, this linking model is in some contexts inadequate in comparison with the linking model found in most modern hypermedia systems. In-line links are hard to maintain (in the sense that links may point to documents that no longer exist or have been moved); it is impossible to determine which links point to a page (save for a brute-force approach using search engines, which is computationally expensive and not necessarily accurate); there can be only one set of links in a document; a link may point to only one destination rather than many; and this destination is limited to either a whole document or a named region therein. The problems with this approach can be illustrated by the following example.
Consider a situation where a company decides on a Web-based intranet solution to allow easy access to its technical documentation. The Web, given the URL naming scheme, is very well suited for document distribution. The technical documents are crucial to several independent groups in the company, and they all wish to put links into the technical documents. Current Web technology yields two scenarios: either give each group its own modifiable copy, or incorporate all links into one central document. The former makes updating the technical documentation difficult, and the latter will clutter the technical document with links interesting to one group but irrelevant to the rest. Both cases leave the technical document open to undesirable modification. A safer and more maintainable solution would be to have one copy of the documentation available and then have each group use its own set of links into the documentation. This cannot be done with the existing Web linking model.
A way to achieve this kind of flexibility is through the use of anchor-based hypermedia, which separates the anchor (or endpoint) and the link from the document. An anchor specifies a span in a document (up to the entire document), an area in a picture, a segment of a video clip, and so forth. Links specify relationships between anchors, as seen in Fig. 1. Anchors and links are stored outside the documents. A hypermedia application used by a user inserts links into documents as they are retrieved (i.e. not at the server, but at the client). Through the hypermedia application the user can decide which sets of links to use, and the company described above would thus be able to maintain several sets of links to the same technical documentation.
Fig. 1. In-line and anchor-based linking.
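As a minimal sketch of the anchor-based model, assuming nothing about how any actual open hypermedia system stores its structures, the separation of anchors, links, and link sets from the documents could look as follows. The class names, the `links_for` helper, and the idea of representing a selection as an opaque string are illustrative assumptions, not part of the systems discussed in this paper.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Anchor:
    """An endpoint stored outside the document it points into."""
    document: str   # URL of the document (hypothetical in the examples)
    selection: str  # opaque, media-specific description of the span


@dataclass
class Link:
    """A relationship between two or more anchors (possibly multiheaded)."""
    anchors: list[Anchor]


@dataclass
class LinkSet:
    """A named collection of links, e.g. one per group in a company."""
    name: str
    links: list[Link] = field(default_factory=list)


def links_for(document: str, active_sets: list[LinkSet]) -> list[Link]:
    """Return every link in the chosen link sets that touches the document."""
    return [
        link
        for link_set in active_sets
        for link in link_set.links
        if any(anchor.document == document for anchor in link.anchors)
    ]
```

In the intranet example, each group would keep its own `LinkSet`, and a reader would pass only the sets of interest to `links_for` when a page of the shared documentation is retrieved; the documentation itself is never modified.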
There are pros and cons to this approach. As links and anchors are now separate, they can be maintained separately and can be checked and updated independently of the documents they link. Links can be bi-directional and multiheaded, and can link into documents without modifying them. However, inserting links into the document on the fly carries some additional overhead, and generally does not scale as well as the simpler in-line link model.
A great benefit of the anchor-based linking approach is that of opaque anchors, that is, the general system is not concerned with how an anchor addresses a selection in a media type. The general model need not be modified if a new anchor type is introduced to support a new media type, as long as the new anchor type adheres to the general anchor specification. The anchoring code (such as the ability to display an anchor in the media type) must of course still be written, but storage, link following, etc. are unaffected. This allows for complex anchoring constructs, and allows developers to support new applications and media types without sacrificing existing work.
Anchors are created to match their media type. They must carry enough information to identify the selection that the user had in mind when the anchor was created. In the case of text anchors, this information often consists of a selection and a context around the selection, so that it may be uniquely identified. An image anchor could (depending on its use) be as simple as two co-ordinate pairs to identify a rectangular selection, or be complex enough to identify arbitrary shapes as destinations. In the context of temporal media, it is natural to use time as the unit in the system of co-ordinates. Frames are another possibility, but the frame rate of a media clip may vary according to the bandwidth of the user’s connection. Furthermore, some temporal media types, such as audio, may not have frames at all.
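The preference for time over frames can be illustrated with a small sketch. The `TemporalAnchor` class and its fields are hypothetical and are not the anchor format used by any system described here; the point is only that an anchor expressed in seconds stays valid while the frame numbers it corresponds to change with the frame rate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TemporalAnchor:
    """A segment of a media clip, addressed in seconds rather than frames."""
    clip: str     # URL of the clip (hypothetical)
    start: float  # seconds from the beginning of the clip
    end: float    # seconds from the beginning of the clip

    def contains(self, t: float) -> bool:
        """True if playback time t (in seconds) falls inside the segment."""
        return self.start <= t <= self.end

    def frame_range(self, fps: float) -> tuple[int, int]:
        """Frame numbers covered by the segment at a given frame rate."""
        return int(self.start * fps), int(self.end * fps)


anchor = TemporalAnchor("http://example.org/clip.mpg", start=10.0, end=12.5)
print(anchor.frame_range(25))  # frames when the clip is delivered at 25 fps
print(anchor.frame_range(10))  # a different range at a reduced rate of 10 fps
```

The same anchor maps to different frame ranges at the two rates, which is why a frame-based anchor would be fragile, and an audio clip could still use the anchor without any notion of frames.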
2.2. Open hypermedia and the Web
Open hypermedia systems are characterised by a focus on integration with third-party software. Historically, hypermedia systems have often been closed and monolithic, requiring other programs to comply with the standards of the hypermedia system in order to provide hypermedia functionality. This is problematic, as it closes the door on existing non-compliant applications and expects developers to change their programs, an unlikely proposition at best. Recognising that most people are not willing to throw away their applications in order to utilise a hypermedia system has led the open hypermedia community to integrate existing applications into their hypermedia systems instead. The degree to which this is possible varies according to the application, ranging from the simple (show document) to the advanced (a full integration). Whitehead discusses the implications of integrating third-party applications with open hypermedia systems in [21].
A natural consequence of the open hypermedia approach is the interest in integrating the Web into open hypermedia systems. Several groups in the field have created Web integration tools; a non-exhaustive list includes DLS [5,6], DHM/WWW [8], Webvise [9], Navette [3], Chimera [1], and HyperWave [18].
A common approach to open hypermedia Web integration is to modify Web pages while en route to the Web browser. This modification usually consists of the addition of links or other kinds of structure. These links are stored on a structure server and are inserted into the pages using CGI scripts, proxies, or programs controlling the Web browser. The interface presented to the user varies, ranging from no interface at all to full authoring applications that allow users to modify and extend the existing collections of links and anchors. Most, however, allow the user to create links to and from whatever pages the user may desire, thus alleviating one of the major limitations of the existing Web architecture mentioned above. For a fuller discussion of the techniques of open hypermedia Web integration and Web augmentation in general, see [4].
The main target of these integrations has been Web pages rather than other kinds of Web-distributed media, as HTML is easily analysed and modified by use of proxies or other means. Other kinds of media that could be interesting include graphics and temporal media, streaming or otherwise. These data types are less readily modified, come in great variety, and require viewers or plug-ins that may be difficult to integrate with an open hypermedia system.
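The page modification described above can be sketched in a few lines, which also suggests why HTML has been the main target. This is a deliberately naive illustration and not the technique of any of the cited systems: the `insert_links` function and its input format are assumptions, and a real integration would parse the HTML and use the context stored with each anchor to locate the selection unambiguously.

```python
import html


def insert_links(page: str, anchors: list[tuple[str, str]]) -> str:
    """Wrap the first occurrence of each anchor text in an <a> element.

    `anchors` holds (selected text, target URL) pairs fetched from an
    external link store; the page on the server is left untouched.
    """
    for text, target in anchors:
        href = html.escape(target, quote=True)
        page = page.replace(text, f'<a href="{href}">{text}</a>', 1)
    return page


original = "<p>See the manual.</p>"
print(insert_links(original, [("manual", "http://example.org/m?a=1&b=2")]))
```

Nothing comparable is possible for a video or audio stream, where there is no textual markup to rewrite in transit; there the viewer or plug-in itself has to cooperate.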
2.3. Hypermedia and temporal media
Several hypermedia systems have been extended or devised specifically to handle temporal media. The most influential hypermedia model to incorporate the notion of temporal data is HyTime [7], which allows for multidimensional anchors (including the temporal dimension). HyTime is a general standard aimed at interchange and as such does not specify the interaction between hypermedia applications and multimedia applications. While many systems have integrated temporal media in one way or another, some systems go further, such as AEDI [2], which tries to ease work with large amounts of temporal data through structured indexing. Some systems try to facilitate automatic tracking and location of anchors in media clips; two well-known examples are Himotoki [10] and MAVIS [15]. While the possibility of selecting an object in one frame and having the system automatically track the object through the rest of the video clip is certainly alluring, it is also quite computationally expensive, and probably impractical in a Web setting.
The ambitions of the authors are far more modest. We are merely interested in being able to create links to and from segments of temporal media. Future versions may include anchors that cover an area of a video clip for some duration, but the area is not expected to move.
2.4. Temporal media on the Web
The use of temporal media on the Web has steadily increased. It is today commonplace for news sites either to offer video clips or to provide access to whole TV shows online. Likewise the research community (notably linguists) has taken to making animations or sound recordings available. This opens up a hitherto unseen availability of valuable research material (such as historic film clips or language recordings), and therefore also an increasing demand to be able to interact with these media clips in new and innovative ways.
An example of an organisation beginning to put large amounts of temporal data on the Web is the Danish national library, Statsbiblioteket. In order to facilitate research, the library has made an increasing number of historically interesting Danish sound files available on the Web [20].
2.5. Emerging standards
Some of the emerging W3C standards are of special interest to the subject of this paper. This section briefly describes these and discusses the implications for temporal media on the Web. It should be noted, however, that these are still evolving standards, and can be expected to undergo some transformation before being finalised.
SMIL [19] (Synchronised Multimedia Integration Language) is a recent Recommendation from W3C. It is an effort to support the layout and synchronisation of multimedia clips, e.g. synchronising a video clip with animations or a slide show with audio narrative and HTML documents. The standard is presentation-oriented, and as such requires an authoring tool to modify presentations.
HTML+TIME [12] (Timed Interactive Multimedia Extensions for HTML) is a proposed standard inspired by SMIL, and is concerned with the implementation of SMIL concepts in HTML. It thus adds a concept of timing to HTML and allows any HTML element to appear for a defined duration. Of special interest in this context are the timed hyperlink and the integration of media players, which can be addressed and timed. Not only can the players be set to begin playing at a predetermined time, it is also possible to address inside the media clip and thus play a segment of a media clip, using the appropriate attributes. As HTML elements (in particular links) can be synchronised with other elements’ timing events, it would be simple to create links that synchronise with segments of a media clip, e.g. links appearing during a segment of a video clip and disappearing after the segment has been played.
As such, HTML+TIME has certain requirements of content handlers that fit nicely with the requirements of links to and from temporal media. Specifically, media players should be able to synchronise with events, as well as allow the showing of segments. This is, however, a very new (proposed) standard, and so far no Web browsers or media players the authors are aware of meet the HTML+TIME requirements.
A general problem with SMIL and HTML+TIME from the standpoint of the authors is that these standards address authoring of presentations and thus are in principle (though admittedly far more advanced) no different from the existing HTML authoring tools. The basic problem of links being created exclusively by the author of the document remains. These standards also address a somewhat different problem than linking to and from temporal media, as a key point of SMIL and HTML+TIME is synchronisation and presentation.
Both standards are, as noted above, still in development and may very well change in the future.
XML defines a way to describe structured data and documents. XPointer and XLink (XML Linking Language) are the accompanying resource location and linking standards. As of this writing, neither has ‘Recommended’ status at W3C, but this is expected to happen sometime in 1999.
XPointer [17] is designed to specify locations in XML and other well-structured documents. The syntax is based on the Text Encoding Initiative ‘extended pointer’. This is a rather compact and tree-oriented notation, which might take the following form:
This expression would start at the entity identified as ‘foo’, continue at this entity’s third child of the type ‘SEC’, and end with that child’s fourth child of the type ‘LIST’. This format is not XML, which might come as a surprise, but it has the advantage that it can be used in URLs.
XLink [16] is the language that ties XPointers together in links. XLink links are not limited to XPointers: while endpoints in XML documents will typically be described with XPointers, XLink links can have any Web resource as a destination. Links may be in-line (as in HTML) or out-of-line, that is, residing outside the linked documents. XLink supports bi-directional multiheaded links. Out-of-line links may be stored in simple text files or handled by link bases. XLink does not offer any protocol for such link servers.
XLink and XPointer are, from a Web augmentation standpoint, mainly interesting if XML becomes widespread on the Web. While the linking constructs are quite powerful in XML documents, the standards are not aimed at improving the state of the art with respect to HTML, which is not well-formed, nor do they address the intricacies of other media types, such as video or audio.
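The tree-oriented addressing described for XPointer above (start at the element identified as ‘foo’, take its third child of type ‘SEC’, then that child’s fourth child of type ‘LIST’) can be sketched as a walk over a parsed XML tree. This is an illustration of the traversal only, assuming a plain `id` attribute as the identifier; the `resolve` and `nth_child` helpers are hypothetical, and no attempt is made to reproduce the draft XPointer syntax itself.

```python
import xml.etree.ElementTree as ET


def nth_child(parent: ET.Element, n: int, tag: str) -> ET.Element:
    """Return the n-th (1-based) direct child of `parent` with the given tag."""
    matches = [child for child in parent if child.tag == tag]
    return matches[n - 1]


def resolve(root: ET.Element, start_id: str,
            steps: list[tuple[int, str]]) -> ET.Element:
    """Start at the element whose id is `start_id`, then follow each step."""
    node = next(e for e in root.iter() if e.get("id") == start_id)
    for n, tag in steps:
        node = nth_child(node, n, tag)
    return node


doc = ET.fromstring(
    '<DOC><BODY id="foo"><SEC/><SEC/>'
    '<SEC><LIST/><LIST/><LIST/><LIST name="target"/></SEC>'
    '</BODY></DOC>'
)
found = resolve(doc, "foo", [(3, "SEC"), (4, "LIST")])
print(found.get("name"))
```

The walk depends entirely on the document being a well-formed tree, which is the reason such pointers do not carry over to much existing HTML, or to video and audio clips.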
3. The Arakne Environment
The Arakne Environment, shown in Fig. 2, is a runtime environment aimed at supporting Web augmentation tools. The environment is primarily but not exclusively aimed at providing advanced hypermedia functionality to the Web. The environment is based on the Arakne framework [4], which is a general Web augmentation model, and was designed to be open and extensible. It currently supports a navigational hypermedia tool, Navette, and a guided tour tool, Ariadne [14].
The framework may support any number of Web augmentation tools. These tools (known as ‘navlets’) are dependent on four core components of the Arakne framework: the Operations, the Hyperstructure Store, the Browser, and the Proxy. The navlet is the domain-specific part of a Web augmentation tool. It provides a user interface as well as special logic to handle the specific domain. This may in-