Second edition 2003-04-08 incorporating TC1,
for ISO/IEC 15445:2000 first edition 2000-05-15,
corrected version 2003-##-##.
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical Commission) form the specialized system for worldwide standardization. National bodies that are members of ISO or IEC participate in the development of International Standards through technical committees established by the respective organization to deal with particular fields of technical activity. ISO and IEC technical committees collaborate in fields of mutual interest. Other international organizations, governmental and non-governmental, in liaison with ISO and IEC, also take part in the work.
International Standards are drafted in accordance with the rules given in ISO/IEC Directives, Part 3.
In the field of information technology, ISO and IEC have established a joint technical committee, ISO/IEC JTC 1. Draft International Standards adopted by the joint technical committee are circulated to national bodies for voting. Publication as an International Standard requires approval by at least 75% of the national bodies casting a vote.
Attention is drawn to the possibility that some of the elements of this International Standard may be the subject of patent rights. ISO and IEC shall not be held responsible for identifying any or all such patent rights.
International Standard ISO/IEC 15445 was prepared by Joint Technical Committee ISO/IEC JTC 1, Information technology, Subcommittee SC34, Document description languages. JTC1/SC34 has worked on this project in close cooperation with the World Wide Web Consortium. This International Standard makes normative reference to the W3C Recommendation for HTML 4.01.
Annexes A and B form a normative part of this International Standard.
This corrected version of the International Standard includes normative technical changes, altering the requirements and recommendations for the use of the W3C Recommendation for HTML 4.01, and extends the support for accessibility to the World Wide Web. The changes result in part from practical experience with the language defined by this International Standard and in part from the World Wide Web Consortium's adoption of HTML 4.01 as the reference specification for the HTML 4 language. Details of the changes are provided in the Supplement to the corrected version of ISO/IEC 15445:2000.
In November 1996 we were authorized to act as the project editors of the ISO/IEC International Standard 15445:2000 for HTML, informally known as "ISO-HTML". The formal specification that we developed, published on May 15th 2000, is intended for SGML experts who are familiar with the SGML family of International Standards, and as such is challenging to read. However, we wanted the standard to be accessible to readers who do not spend their time working on SGML standards. This User's Guide is intended to encourage and assist people wishing to develop high quality IT applications on the World Wide Web and set high standards of document design and management. We assume a familiarity with the W3C Recommendation for HTML 4.01, but the reader is not expected to be an expert in SGML.
We have received help and encouragement from many people during the development of the International Standard and this User's Guide. Many members of the IETF HTML Working Group commented on the early strawman which led to the formal introduction of the ISO-HTML project. We also received assistance from the staff of the World Wide Web Consortium (W3C) and from people in W3C member organizations. We have worked in close cooperation with the W3C Working Group which developed the HTML Recommendation, and at the invitation of the W3C, we have taken the W3C Recommendation for HTML 4.01 as a referenced text. The ISO/IEC Working Group responsible for the SGML family of standards have provided us with direction, and encouraged and supported our close liaison with the W3C. We have also received help directly from members of National Bodies and from members of the public commenting in general mailing lists. A special word of thanks and appreciation is due to Dave Raggett who accepted an invitation to act as Invited Expert at the Dublin meeting held in July 1997 which established the principle of technically harmonized text, and made ISO-HTML a true subset of the W3C HTML specification.
This Guide is not a formal document, neither is it intended as a reference specification, and it is not appropriate to cite it as such. However, if you the reader find it useful, then we will have met our objectives.
David M. Abrahamson
Trinity College Dublin.
d a v i d at c s dot t c d dot i e
Roger Price
University of Massachusetts Lowell.
r p r i c e at c s dot u m l dot e d u
This second edition of the User's Guide, and the International Standard, are now generated from the same source file using ISO 8879 based technology, thus simplifying maintenance and ensuring technical alignment. The common source file is marked up using the Pre-HTML DTD specified in this User's Guide and then transformed to conforming instances of ISO/IEC 15445 using the technique described in the chapter "Document preparation".
The additional material added by the Guide is marked up with the attribute class="UG". A W3C CSS2 style sheet associates class "UG" with the style of this paragraph. The text introduced by Technical Corrigendum 1 is highlighted in this style. If there is to be a Technical Corrigendum 2, it will be highlighted in this style, and so on...
We would like to thank Russell O'Connor, Michael Huang, Nicolas Lesbats and Edward Welbourne for their helpful comments and suggestions.
Copyright © 2000-2003 Roger Price, David Abrahamson. All Rights Reserved.
This guide is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
The names of the copyright holders may NOT be used in advertising or publicity pertaining to this document or its contents without specific, written prior permission.
This document describes ISO/IEC 15445:2000 which is subject to IETF, W3C (MIT, Inria, Keio) and ISO/IEC copyright. Because the U.S. Department of Energy has supported the development of International Standards by JTC1/SC34 (under contract DE-AC05-84OR21400), it makes the following assertion about the International Standard:
The U.S. Government retains a paid-up, nonexclusive, irrevocable, world-wide license to publish or reproduce the published form of these documents, prepare derivative works, distribute copies to the public, and perform publicly and display publicly, or to allow others to do so, for U.S. Government purposes.
The HyperText Markup Language (HTML) is an application of the International Standard ISO 8879 -- Standard Generalized Markup Language (SGML). It provides a simple way of structuring hypertext documents and of placing references in one document which point to another. These references, called "links", may be presented to readers of a document in such a way that a simple "click" summons the other document, which is then presented to the reader. The reader has the impression of moving from one document to another. This simple user interface has been wildly successful and as a result the World Wide Web, the "web", has become extremely popular.
In the frenzy of the growth, much of the discipline and good practice of the mature SGML world has been lost, and browser developers have added additional features to the markup language such as new tags and new semantics for tags. As a result, many documents have been created which can only be rendered faithfully on a limited number of browsers. Common web practice is to hide any syntactic problems detected by the browsers and thus the reader is not aware that a page being browsed is not always faithful to the original authored document.
The International Standard was developed in an effort to ensure that it will remain possible for an author to produce simple hypertext for the web and be confident that a conforming browser will be able to render the document faithfully. ISO/IEC 15445 represents a core of the language to be supported by all conforming browsers, authoring and validating systems. This International Standard is a refinement of the World Wide Web Consortium's (W3C's) Recommendation for HTML 4.01: it provides further rules to condition and refine the use of the W3C Recommendation in a way which emphasizes the use of stable and mature features, and represents accepted SGML practice. Documents which conform to this International Standard also conform to the strict DTD provided by the W3C Recommendation for HTML 4.01.
ISO-HTML omits all deprecated features of the language, features whose role is purely cosmetic, and features which are still unstable or immature. This has been done in preparation for the expected wide adoption of style sheets by authors and browser manufacturers. Certain optional facilities such as markup omission of the document and other major elements have been removed to produce more robust texts in keeping with recognized good SGML practice. This does not reduce in any way the expressive power of the language.
This International Standard makes a clear and important distinction between conforming systems and validating systems. A conforming system operates correctly when handling documents which conform to this International Standard, but is not required to operate correctly when the documents do not conform. A validating system is more powerful: it detects all SGML and HTML errors in a document, and must be able to certify that a document is valid ISO-HTML. Frequently browsers are conforming systems whereas authoring tools should check for validity. Authoring tools which issue broken, non-conforming pages are a major cause of the low quality of many sites.
NOTE: A conforming system is not sufficient to validate an ISO-HTML document. A validating system is required.
This International Standard does not define error handling procedures for user agents: it emphasises validation at source rather than error handling at the destination.
A minimal ISO-HTML document has the form:
<!DOCTYPE HTML PUBLIC "ISO/IEC 15445:2000//DTD HTML//EN">
<HTML>
<HEAD>
<TITLE>Les unités de base</TITLE>
... other head elements ...
</HEAD>
<BODY>
<P>La seconde...
... remainder of document body ...
</BODY>
</HTML>
This User's Guide follows the convention of presenting element and attribute names in upper case, although there is no formal requirement for the practice.
NOTE: ISO-HTML is an application of SGML and the SGML declaration used calls for upper case folding of all names except entity names. (XHTML™ is an application of XML™ and has an SGML declaration which does not call for upper case folding, i.e. XHTML names are case sensitive whereas names in ISO-HTML and the W3C Recommendation for HTML 4.01 are not.)
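As a rough illustration of the name folding described in the note above, the following Python sketch compares names once with upper case folding and once without. The function names are hypothetical and are not taken from the International Standard; the sketch only models the comparison of element and attribute names, not entity names, which the note excludes from folding.

```python
def same_name_iso_html(a: str, b: str) -> bool:
    """Compare two element or attribute names after upper case folding."""
    return a.upper() == b.upper()


def same_name_xhtml(a: str, b: str) -> bool:
    """Compare two names without folding, i.e. case sensitively."""
    return a == b


if __name__ == "__main__":
    # "title" and "TITLE" are the same name only when folding is applied.
    print(same_name_iso_html("title", "TITLE"))
    print(same_name_xhtml("title", "TITLE"))
```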
In order to support world wide use of the markup language, the internationalization facilities specified by the IETF in RFC2070 have been included in the International Standard. It is recognised that full compliance to RFC2070 will be progressive and the conformance clause allows for progressive compliance to the use of ISO 10646.
To facilitate the use of this User's Guide, frequent references are provided to the W3C Recommendation for HTML 4.01. These take two forms: hyperlinks to the W3C's electronic sources, and clause number references to the W3C's printed version of their specification in the style [W3C 12.3].
The World Wide Web Consortium have prepared a Recommendation for XHTML™ which recasts the W3C Recommendation for HTML 4.01 as an application of XML.
The scope of this International Standard is a conforming application of ISO 8879, SGML. This International Standard describes the way in which the HTML language specified by the following clauses in the W3C Recommendation for HTML 4.01 shall be used, and does so by identifying all the differences between the HTML language specified by the W3C Recommendation for HTML 4.01 and the HTML language defined by this International Standard:
<BIG> [W3C 15.2.1], <SMALL> [W3C 15.2.1], <STRIKE> [W3C 15.2.1], <S> [W3C 15.2.1] and <U> [W3C 15.2.1] element [type]s.
The scope excludes any material in the W3C Recommendation for HTML 4.01 not listed in this clause. It also excludes any standardization of models, services, systems, protocols or applications which are likely to make use of the ISO-HTML language. ISO-HTML does not define the "look and feel" of any conforming product, and provides only sufficient semantics to allow a reader who is familiar with the W3C Recommendation for HTML 4.01 to have an intuitive idea of the requirement.
This International Standard distinguishes between conforming documents, validating systems, conforming systems and character set conformance.
The distinction between validating systems and conforming systems is very important.
A validating system is one that is able to verify that the document it is processing contains correct HTML. If the document is correct, the validator certifies it as such; if not, the validator identifies the errors. The notion of validation is currently poorly defined on the World Wide Web and many authors assume wrongly that their browser may be used to check out the pages they write.
"I tried it with my browser and it worked!" is an all too common mistake, and is the source of many errors and broken pages.
ISO-HTML insists that a validator requires an SGML parser since ISO-HTML makes full use of the underlying SGML language. Conforming systems do not require an SGML parser since they merely promise to operate correctly provided that the documents they process are already validated as conforming to ISO-HTML.
NOTE: It is possible for a system that is simply "conforming" to identify many errors in an invalid document, and notification of such errors could be of value to a user, but it is not "validating" unless it can detect all errors.
A document which conforms to this International Standard shall
<HTML> [W3C 7.3] document element.
The document type declaration may be surrounded by white space consisting of RS, RE, SPACE, TAB and HTML comments. The document instance may also be followed by such white space.
In other words, to be a conforming document, documents are required to have the following structure:
"White space" is the term used by programmers for the characters between tokens, even if the style sheet makes them appear in some other colour. We will use the common term.
<HTML> [W3C 7.3] document element.
White space consists of the SGML-defined characters RS (record start), RE (record end), SEPCHAR (tab) and SPACE [8879 9.2.1 figure 2], and ISO-HTML comments.
An HTML system is a validating HTML system if
The International Standard does not say how the validation system is to report the errors: whether this is "one at a time" or "all at once" is left to the implementor. The SP parser provides the -E option with which the user may specify a maximum number of error messages to be displayed. This is useful for checking pages of possibly very low quality.
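A hedged sketch of how a script might cap the number of reported errors when calling an SP-based parser. It assumes that the SP command-line parser is installed under the name nsgmls (OpenSP installs it as onsgmls), that its -s option suppresses the normal output so that only error messages remain, and that an SGML catalog resolving the ISO-HTML public identifier is already configured. The function names, the default limit and the file name are hypothetical.

```python
import subprocess


def build_command(path: str, max_errors: int = 50, parser: str = "nsgmls") -> list:
    """Build the argument list: -s suppresses output, -E caps the error count."""
    if max_errors < 0:
        raise ValueError("max_errors must not be negative")
    return [parser, "-s", "-E", str(max_errors), path]


def run_parser(path: str, max_errors: int = 50) -> str:
    """Run the parser and return the error messages it wrote to standard error."""
    result = subprocess.run(
        build_command(path, max_errors),
        capture_output=True,
        text=True,
        check=False,  # a non-zero exit status only means that errors were reported
    )
    return result.stderr
```

For example, `run_parser("page.html", max_errors=20)` would stop after twenty messages on a badly broken page. As the following note explains, a clean run of an SGML parser is only a first step and does not by itself make a validating system.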
NOTE: This requires more than a validating SGML parser is able to offer; nevertheless, validation by an SGML parser is an essential first step. Some of the ISO-HTML errors a validating system is required to detect cannot be detected by an SGML parser, and require further processing.
Validating systems are required by ISO-HTML to display a text identifying them clearly as validating systems.
Validating systems conforming to this International Standard shall display the following identification text prominently and in the national language of the documentation:
The HTML validating system identification text is:
An HTML validating system conforming to International Standard ISO/IEC 15445—HyperText Markup Language, and International Standard ISO 8879—Standard Generalized Markup Language (SGML).
NOTE: The validating system identification text is copyrighted by the ISO/IEC, but may be used without further permission or further reference to the ISO/IEC.
NOTE: Neither the ISO nor the IEC provide a certification service, nor do they provide an icon to identify validating or conforming systems. The ISO and IEC icons are copyrighted and cannot be used without the permission of those organisations. The International Standard gives permission to use the identification text but not the icon.
A conforming HTML system is an HTML system which is able to process all documents conforming to this standard.
The International Standard says nothing about error handling or the processing of non-conforming documents. The basic creed is that in a high quality web application, all documents are validated as conforming before publication, and that conforming documents are sent to conforming user agents to obtain correct results.
Nevertheless, a prudent implementor of a program which is just a conforming system would be wise to guard against broken HTML, perhaps maliciously fed to the program in an attempt to provoke a buffer overrun and defeat security mechanisms.
The documentation of conforming systems is handled in much the same way as that of validating systems. The only difference is the identifying text itself. It is important that the documentation not claim or suggest that a conforming system may be used to validate ISO-HTML documents.
Conforming systems shall display the following identification text prominently and in the national language of the documentation:
The HTML conforming system identification text is:
An HTML system conforming to International Standard ISO/IEC 15445—HyperText Markup Language.
The documentation shall not claim or imply that the system may be used to validate HTML documents.
The SGML declaration provided with this International Standard calls for the use of ISO/IEC 10646 Universal Multiple-Octet Coded Character Set (UCS). ISO/IEC 10646 specifies a large number of facilities from which different selections may be made to suit individual applications. ISO/IEC 10646 is potentially very large and although the described character set portion identified by the DESCSET keyword [8879 13.1.1.2] calls for the whole character set, the International Standard does not require that it is fully implemented in any user agent. As a result it is only practicable to envisage limited conformance to ISO/IEC 10646 as defined in this subclause.
ISO-HTML takes the same approach as was taken by ISO 2022, and this subclause is based on ISO 2022 clause 3.
Under limited conformance, the following is required:
NOTE: The International Standard does not say how the problem is to be explained. This is left entirely to the implementor to decide. Neither does the International Standard discuss any negotiation that might be done, or the operation of the HTTP protocol.
The UTF-1 transformation format of ISO/IEC 10646, registered by IANA as ISO-10646-UTF-1, has been removed from ISO/IEC 10646 and should not be used.
The following normative documents contain provisions which, through reference in this text, constitute provisions of this International Standard. For dated references, subsequent amendments to, or revisions of, any of these publications do not apply. However, parties to agreements based on this International Standard are encouraged to investigate the possibility of applying the most recent editions of the normative documents indicated below. For undated references, the latest edition of the normative document referred to applies. Members of IEC and ISO maintain registers of currently valid International Standards.
NOTE: In an ISO/IEC specification, a normative reference has the effect of including all the provisions of the referenced text into the referencing text. The W3C Recommendation itself contains normative references, but it is implicit that the effect is not one of "total normative inclusion". The W3C normative references appear to be closer in spirit to ISO/IEC informative references defining good practice, and we recommend that they should be treated as such.
This International Standard refers normatively to:
NOTE: The W3C Recommendation contains further normative references and defines their application.
NOTE: ISO/IEC 10744 HyTime provides the techniques required to allow HTML to be used as a base architecture for other SGML applications.
NOTE: HyTime is a large (472 pages) and complex International Standard which provides a language and underlying model for the representation of "hyperdocuments" that link and synchronize static and dynamic (time-based) information contained in multiple conventional and multimedia documents and information objects. The language is known as the "Hypermedia/Time-based Structuring Language", or "HyTime". HyTime is an application of SGML. In Annex A.3.1.1 it defines architectural forms which are fragments of DTD that may be incorporated into some other DTD. HyTime provides syntax to specify the fragment. The ISO-HTML DTD may be used as such an architectural form.
For the purposes of this International Standard, the definitions given in ISO 8879:1986 and the following definitions apply:
HREF attribute value following the `#' character.
All the definitions of SGML are incorporated into ISO-HTML.
The multiple definitions and techniques for the representation of characters may be the source of confusion. The following figure shows some of the ideas involved. It is based on the character set defined by ISO 8859-1:1987 "8-bit single-byte coded graphic character sets", Part 1: Latin alphabet No. 1.
The following symbols and abbreviated terms are used in this International Standard:
This International Standard has been designed to satisfy the following requirements:
NOTE: The techniques for using HTML as a base architecture are provided by ISO/IEC 10744 HyTime.
The International Standard states the requirements it had to meet in terms of the relationship between ISO-HTML and SGML and the need for ISO-HTML to be viewable with browsers which conform to the W3C Recommendation for HTML 4.01.
The underlying requirements were to:
id attribute rather than the name attribute. This allows an SGML parser to check that the value is unique.
<FONT> [W3C 15.2.2] and attributes such as BGCOLOR which provide style rather than structure.
Throughout this User's Guide, references to the printed version of the referenced text are given in the abbreviated style [W3C 12.3].
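The remark above that an SGML parser can check ID values for uniqueness can be illustrated with a small Python sketch. It is only an approximation: Python's html.parser is a lenient tokenizer and not an SGML parser, the class and function names are hypothetical, and the values are compared exactly as written, which is a simplifying assumption.

```python
from collections import Counter
from html.parser import HTMLParser


class IdCollector(HTMLParser):
    """Collect the value of every ID attribute found on a start tag."""

    def __init__(self):
        super().__init__()
        self.ids = []

    def handle_starttag(self, tag, attrs):
        # html.parser lower-cases attribute names before calling this method.
        for name, value in attrs:
            if name == "id" and value is not None:
                self.ids.append(value)


def duplicate_ids(document: str) -> list:
    """Return the ID values that occur more than once, in sorted order."""
    collector = IdCollector()
    collector.feed(document)
    collector.close()
    counts = Counter(collector.ids)
    return sorted(value for value, n in counts.items() if n > 1)
```

Applied to a fragment in which two elements both carry ID="intro", `duplicate_ids` returns a list containing "intro". No such check is possible for NAME values, because the same NAME may legitimately appear more than once.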
The set of element types provided by this International Standard is a subset of the set of element [type]s defined by the W3C Recommendation for HTML 4.01. The set of attributes provided for each element type included in this International Standard is a subset of the corresponding set defined by the W3C Recommendation for HTML 4.01. The set of element types and the sets of attributes are defined by the DTD provided with this International Standard.
Where refinements are defined for element types and attributes, the semantics are a subset of the semantics defined by the W3C Recommendation for HTML 4.01 in the sense that the set of documents conforming to this International Standard is a subset of those conforming to the W3C Recommendation for HTML 4.01.
NOTE: For clarity, and as required by ISO 8879, this International Standard makes a distinction between an individual element with a given generic identifier and the class of all such elements. The class is called an element type, the instance is called an element and the generic identifier is called an element type name.
ISO 8879 distinguishes between element type [8879 11.2.1] and element [8879 7.3], which is an instance of the type, whereas the W3C Recommendation for HTML 4.01 uses the term "element" for both element and element type. This guide follows the ISO practice, and when quoting from the W3C Recommendation for HTML 4.01 inserts the missing word in square brackets when it is needed, e.g. [type].
While the syntax of ISO-HTML is defined by the DTD provided by the International Standard, the semantics of the following element types are defined normatively in the W3C Recommendation for HTML 4.01:
<ABBR> [W3C 9.2.1]: Abbreviation
<ACRONYM> [W3C 9.2.1]: Acronym
<B> [W3C 15.2.1]: Bold character style
<BDO> [W3C 8.2.4]: Bidirectional override
<BR> [W3C 9.3.2]: Line break
<CAPTION> [W3C 11.2.2]: Table caption
<CITE> [W3C 9.2.1]: Citation
<CODE> [W3C 9.2.1]: Program code
<DD> [W3C 10.3]: Definition data
<DEL> [W3C 9.4]: Deleted material
<DFN> [W3C 9.2.1]: Defining instance
<DIV> [W3C 7.5.4]: Document division
<DL> [W3C 10.3]: Definition list
<DT> [W3C 10.3]: Definition term
<EM> [W3C 9.2.1]: Emphasized text
<FIELDSET> [W3C 17.10]: Group of form items
<FORM> [W3C 17.3]: Forms
<HR> [W3C 15.3]: Horizontal rule
<I> [W3C 15.2.1]: Italic character style
<INS> [W3C 9.4]: Inserted material
<KBD> [W3C 9.2.1]: Keyboard input
<LEGEND> [W3C 17.10]: Fieldset label
<LI> [W3C 10.2]: List item
<META> [W3C 7.4.4]: Document meta-information
<OL> [W3C 10.2]: Ordered list
<OPTGROUP> [W3C 17.6]: Group of user choices
<OPTION> [W3C 17.6]: User choice
<P> [W3C 9.3.1]: Paragraph
<PARAM> [W3C 13.3.2]: Agent interface parameter
<PRE> [W3C 9.3.4]: Preformatted text
<SAMP> [W3C 9.2.1]: Sample output
<SELECT> [W3C 17.6]: Form selection
<SPAN> [W3C 7.5.4]: Generic container
<STRONG> [W3C 9.2.1]: Strong emphasis
<SUB> [W3C 9.2.3]: Subscript character style
<SUP> [W3C 9.2.3]: Superscript character style
<TEXTAREA> [W3C 17.7]: Multi-line text field
<TFOOT> [W3C 11.2.3]: Table footer
<THEAD> [W3C 11.2.3]: Table header
<TITLE> [W3C 7.4.2]: Document title
<TT> [W3C 15.2.1]: Monospaced character style
<UL> [W3C 10.2]: Unordered list
<VAR> [W3C 9.2.1]: Generic variable
NOTE: In case you are curious, the lettered list is the official ISO style for lists.
The definitions of the following element types are refined by the International Standard:
<A> [W3C 12.2]: Source and target anchors
<ADDRESS> [W3C 7.5.6]: Author's address
<AREA> [W3C 13.6.1]: Image map region
<BLOCKQUOTE> [W3C 9.2.2]: Block quotation
<BODY> [W3C 7.5.1]: Document body
<BUTTON> [W3C 17.5]: Selectable input mechanism
<COL> [W3C 11.2.4]: Table column properties
<COLGROUP> [W3C 11.2.4]: Table column group properties
<HEAD> [W3C 7.4.1]: Document header
<HTML> [W3C 7.3]: Document instance
<H1> [W3C 7.5.5]: Major section header
<H2> [W3C 7.5.5]: Section header
<H3> [W3C 7.5.5]: Subsection header
<H4> [W3C 7.5.5]: Subsubsection header
<H5> [W3C 7.5.5]: Subsubsubsection header
<H6> [W3C 7.5.5]: Minor subsubsubsection header
<IMG> [W3C 13.2]: Inline images
<INPUT> [W3C 17.4]: User input field
<LABEL> [W3C 17.9.1]: Form field label
<LINK> [W3C 12.3]: Interdocument relations
<MAP> [W3C 13.6.1]: Client-side image map
<OBJECT> [W3C 13.3]: Simple agent
<Q> [W3C 9.2.2]: Quote
<STYLE> [W3C 14.2.3]: Style specification
<TABLE> [W3C 11.2.1]: Tables
<TBODY> [W3C 11.2.3]: Table body
<TD> [W3C 11.2.6]: Table data cell
<TH> [W3C 11.2.6]: Table header cell
<TR> [W3C 11.2.5]: Table row
Any element type not listed in this or the preceding subclause is excluded from the International Standard.
The W3C Recommendation for HTML 4.01 provides a number of attributes that are not supported by the International Standard. They have been omitted because they are used to describe appearance rather than structure, or because the feature is considered to be still too unstable or immature for an International Standard.
ALIGN: Omitted from all elements on which it occurs.
ALINK: Omitted from all elements on which it occurs.
ALT: Omitted from <INPUT> [W3C 17.4].
ARCHIVE: Omitted from <OBJECT> [W3C 13.3].
BACKGROUND: Omitted from <BODY> [W3C 7.5.1].
BGCOLOR: Omitted from all elements on which it occurs.
BORDER: Omitted from all elements on which it occurs.
CELLPADDING: Omitted from <TABLE> [W3C 11.2.1].
CELLSPACING: Omitted from <TABLE> [W3C 11.2.1].
CHAR: Omitted from all elements on which it occurs.
CHAROFF: Omitted from all elements on which it occurs.
CLEAR: Omitted from <BR> [W3C 9.3.2].
COMPACT: Omitted from all elements on which it occurs.
COORDS: Omitted from <A> [W3C 12.2].
FRAME: Omitted from <TABLE> [W3C 11.2.1].
HEIGHT: Omitted from all elements on which it occurs.
HSPACE: Omitted from all elements on which it occurs.
LINK: Omitted from <BODY> [W3C 7.5.1].
NAME: Omitted from <FORM> [W3C 17.3].
NAME: Omitted from <IMG> [W3C 13.2].
NOSHADE: Omitted from <HR> [W3C 15.3].
NOWRAP: Omitted from <TD> [W3C 11.2.6] and <TH> [W3C 11.2.6].
ONBLUR: Omitted from all elements on which it occurs.
ONCHANGE: Omitted from all elements on which it occurs.
ONCLICK: Omitted from all elements on which it occurs.
ONDBLCLICK: Omitted from all elements on which it occurs.
ONFOCUS: Omitted from all elements on which it occurs.
ONKEYDOWN: Omitted from all elements on which it occurs.
ONKEYPRESS: Omitted from all elements on which it occurs.
ONKEYUP: Omitted from all elements on which it occurs.
ONLOAD: Omitted from all elements on which it occurs.
ONMOUSEDOWN: Omitted from all elements on which it occurs.
ONMOUSEMOVE: Omitted from all elements on which it occurs.
ONMOUSEOUT: Omitted from all elements on which it occurs.
ONMOUSEOVER: Omitted from all elements on which it occurs.
ONMOUSEUP: Omitted from all elements on which it occurs.
ONRESET: Omitted from all elements on which it occurs.
ONSELECT: Omitted from all elements on which it occurs.
ONSUBMIT: Omitted from all elements on which it occurs.
ONUNLOAD: Omitted from all elements on which it occurs.
RULES: Omitted from <TABLE> [W3C 11.2.1].
SHAPE: Omitted from <A> [W3C 12.2].
SIZE: Omitted from <HR> [W3C 15.3].
SRC: Omitted from <INPUT> [W3C 17.4].
START: Omitted from <OL> [W3C 10.2].
STYLE: Omitted from all elements on which it occurs.
TARGET: Omitted from all elements on which it occurs.
TEXT: Omitted from <BODY> [W3C 7.5.1].
TYPE: Omitted from <LI> [W3C 10.2], <OL> [W3C 10.2] and <UL> [W3C 10.2].
USEMAP: Omitted from <INPUT> [W3C 17.4].
VALIGN: Omitted from all elements on which it occurs.
VALUE: Omitted from <LI> [W3C 10.2].
VERSION: Omitted from <HTML> [W3C 7.3].
VLINK: Omitted from all elements on which it occurs.
VSPACE: Omitted from all elements on which it occurs.
WIDTH: Omitted from all elements on which it occurs.
This clause in the International Standard covers matters that are not associated with a particular element.
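Looking back at the list of omitted attributes, the following Python sketch shows one way an authoring aid might flag them in a page. It is not a validator: it relies on Python's lenient html.parser rather than an SGML parser, the tables are transcribed from the list in this guide, and the class and function names are hypothetical.

```python
from html.parser import HTMLParser

# Attributes the list reports as omitted from every element on which they occur.
OMITTED_EVERYWHERE = {
    "ALIGN", "ALINK", "BGCOLOR", "BORDER", "CHAR", "CHAROFF", "COMPACT",
    "HEIGHT", "HSPACE", "STYLE", "TARGET", "VALIGN", "VLINK", "VSPACE", "WIDTH",
    "ONBLUR", "ONCHANGE", "ONCLICK", "ONDBLCLICK", "ONFOCUS", "ONKEYDOWN",
    "ONKEYPRESS", "ONKEYUP", "ONLOAD", "ONMOUSEDOWN", "ONMOUSEMOVE",
    "ONMOUSEOUT", "ONMOUSEOVER", "ONMOUSEUP", "ONRESET", "ONSELECT",
    "ONSUBMIT", "ONUNLOAD",
}

# Attributes the list reports as omitted from particular elements only.
OMITTED_ON = {
    "ALT": {"INPUT"}, "ARCHIVE": {"OBJECT"}, "BACKGROUND": {"BODY"},
    "CELLPADDING": {"TABLE"}, "CELLSPACING": {"TABLE"}, "CLEAR": {"BR"},
    "COORDS": {"A"}, "FRAME": {"TABLE"}, "LINK": {"BODY"},
    "NAME": {"FORM", "IMG"}, "NOSHADE": {"HR"}, "NOWRAP": {"TD", "TH"},
    "RULES": {"TABLE"}, "SHAPE": {"A"}, "SIZE": {"HR"}, "SRC": {"INPUT"},
    "START": {"OL"}, "TEXT": {"BODY"}, "TYPE": {"LI", "OL", "UL"},
    "USEMAP": {"INPUT"}, "VALUE": {"LI"}, "VERSION": {"HTML"},
}


class OmittedAttributeFinder(HTMLParser):
    """Record (element, attribute) pairs that use an omitted attribute."""

    def __init__(self):
        super().__init__()
        self.findings = []

    def handle_starttag(self, tag, attrs):
        element = tag.upper()  # fold names to upper case, as this guide does
        for name, _value in attrs:
            attribute = name.upper()
            if attribute in OMITTED_EVERYWHERE or element in OMITTED_ON.get(attribute, ()):
                self.findings.append((element, attribute))


def find_omitted_attributes(document: str) -> list:
    """Return the omitted attributes found, in document order."""
    finder = OmittedAttributeFinder()
    finder.feed(document)
    finder.close()
    return finder.findings
```

The per-element table matters: NAME on INPUT, for instance, is not flagged, because the list omits NAME only from FORM and IMG.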
When an HTML text is transmitted as a multibyte character set UCS-2 or UCS-4, this International Standard follows RFC2070 and recommends:
The International Standard defines two classes of structure: block element types and text element types. The two classes are defined in the ISO-HTML DTD by the entities:
%block;
<BLOCKQUOTE> [W3C 9.2.2], <DIV> [W3C 7.5.4], <DL> [W3C 10.3], <FIELDSET> [W3C 17.10], <FORM> [W3C 17.3], <HR> [W3C 15.3], <OL> [W3C 10.2], <P> [W3C 9.3.1], <PRE> [W3C 9.3.4], <TABLE> [W3C 11.2.1] and <UL> [W3C 10.2].
NOTE: The %block; class corresponds to the %block; parameter entity in the W3C Recommendation for HTML 4.01, but excludes the %heading; element [type]s and the <ADDRESS> [W3C 7.5.6] element [type].
%text;
NOTE: The %text; class corresponds to the %inline; parameter entity in the W3C Recommendation for HTML 4.01, but without the %formctrl; element [type].
The subclasses are defined by entities:
%physical.styles;
<B> [W3C 15.2.1], <I> [W3C 15.2.1], <SUB> [W3C 9.2.3], <SUP> [W3C 9.2.3] and <TT> [W3C 15.2.1].
NOTE: The physical styles are called %fontstyle; in the W3C Recommendation for HTML 4.01. ISO-HTML adds <SUB> [W3C 9.2.3] and <SUP> [W3C 9.2.3] taken from %special;, and sorts the set into alphabetical order.
%logical.styles;
<ABBR> [W3C 9.2.1], <ACRONYM> [W3C 9.2.1], <CITE> [W3C 9.2.1], <CODE> [W3C 9.2.1], <DFN> [W3C 9.2.1], <EM> [W3C 9.2.1], <KBD> [W3C 9.2.1], <SAMP> [W3C 9.2.1], <STRONG> [W3C 9.2.1] and <VAR> [W3C 9.2.1].
NOTE: %logical.styles; are called %phrase; in the W3C Recommendation for HTML 4.01. The ISO-HTML DTD presents the elements in alphabetical order.
%special;
<A> [W3C 12.2], <BDO> [W3C 8.2.4], <BR> [W3C 9.3.2], <IMG> [W3C 13.2], <OBJECT> [W3C 13.3], <MAP> [W3C 13.6.1], <Q> [W3C 9.2.2] and <SPAN> [W3C 7.5.4].
NOTE: The ISO-HTML special subclass corresponds to %special; in the W3C Recommendation for HTML 4.01, but excludes <SUB> [W3C 9.2.3] and <SUP> [W3C 9.2.3] which ISO-HTML considers to be physical styles. Those that are included are in alphabetical order.
The distinction between block elements and text elements appears in
For details, see the W3C Recommendation for HTML 4.01.
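The element lists in this clause can be held in a small lookup table when writing checking tools. The following is only a sketch: the set names and the function classify are our own, and the sets contain just the element names listed in this clause for %block; and the three subclasses, nothing more.

```python
# Element names as listed in this clause (upper case, no angle brackets).
BLOCK = {"BLOCKQUOTE", "DIV", "DL", "FIELDSET", "FORM", "HR",
         "OL", "P", "PRE", "TABLE", "UL"}
PHYSICAL_STYLES = {"B", "I", "SUB", "SUP", "TT"}
LOGICAL_STYLES = {"ABBR", "ACRONYM", "CITE", "CODE", "DFN", "EM",
                  "KBD", "SAMP", "STRONG", "VAR"}
SPECIAL = {"A", "BDO", "BR", "IMG", "OBJECT", "MAP", "Q", "SPAN"}


def classify(element_name: str) -> str:
    """Return the class or subclass listed above for an element name."""
    name = element_name.upper()
    if name in BLOCK:
        return "block"
    if name in PHYSICAL_STYLES:
        return "physical.styles"
    if name in LOGICAL_STYLES:
        return "logical.styles"
    if name in SPECIAL:
        return "special"
    # Headings, ADDRESS, form fields and so on are not covered by this sketch.
    return "unlisted"
```

For example, classify("p") gives "block" and classify("SUB") gives "physical.styles", reflecting the move of <SUB> and <SUP> into the physical styles.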
The DTD provided by this International Standard has the following formal public identifiers:
"ISO/IEC 15445:2000//DTD HyperText Markup Language//EN"
"ISO/IEC 15445:2000//DTD HTML//EN"
NOTE: The second formal public identifier is shorter, but has exactly the same meaning as the first.
The DTD is typically invoked by one of the following declarations:
<!DOCTYPE HTML PUBLIC "ISO/IEC 15445:2000//DTD HyperText Markup Language//EN">
<!DOCTYPE HTML PUBLIC "ISO/IEC 15445:2000//DTD HTML//EN">
The document type declaration shall not include a document type declaration subset [8879 11.1].
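A quick textual check of these two rules might look as follows. It is a sketch, not a substitute for an SGML parser: the function name has_iso_html_doctype is our own, and the pattern deliberately accepts nothing between the public identifier and the closing ">" except white space, so a declaration subset in square brackets is rejected.

```python
import re

# The two formal public identifiers given above.
FPIS = (
    "ISO/IEC 15445:2000//DTD HyperText Markup Language//EN",
    "ISO/IEC 15445:2000//DTD HTML//EN",
)

# <!DOCTYPE HTML PUBLIC "fpi"> with only white space before ">",
# i.e. no [ ... ] document type declaration subset.
_DOCTYPE = re.compile(
    r'\s*<!DOCTYPE\s+HTML\s+PUBLIC\s+"([^"]*)"\s*>', re.IGNORECASE
)


def has_iso_html_doctype(document: str) -> bool:
    """True if the text starts with one of the two ISO-HTML declarations."""
    m = _DOCTYPE.match(document)
    return bool(m) and m.group(1) in FPIS
```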
NOTE: The DTD provides an optional mechanism to facilitate the production of conforming documents. The optional mechanism, which is not a part of this International Standard, allows an SGML parser to verify the correct nesting of sections and requires the use of an alternative document type declaration which is described in the User's Guide to ISO/IEC 15445. The Guide also provides descriptions of the SGML techniques used in the documentation preparation process.
The exclusion of the document type declaration subset [8879 11.1] by the International Standard prevents the use of parameter entities [8879 B.6] in conforming documents. Parameter entities declared in the subset can be useful in documents in the same way that macros are useful in programming languages. We will explain later how to take advantage of the power of parameter entities when preparing ISO-HTML documents. This will require a modified document type which is invoked by the document type declaration:
<!DOCTYPE Pre-HTML PUBLIC
"-//ISO-HTML User's Guide//DTD Preparation of ISO-HTML//EN"
[<!ENTITY % Preparation "INCLUDE" >
general entity declarations...
]>
This modified document type declaration is not a part of the International Standard, but is useful in preparing documents which conform to ISO-HTML.
NOTE: The International Standard and this User's Guide were prepared from a common source marked up using the modified document type declaration.
In order to use the HTML document type definition as a base architecture for other SGML applications, one of the following architectural support declarations should be used:
<!ENTITY % HtmlDtd PUBLIC "ISO/IEC 15445:2000//DTD HTML//EN">
<?IS10744 ArcBase HTML>
<!NOTATION HTML PUBLIC
"-//ISO-HTML User's Guide//NOTATION HTML Architecture//EN">
<!ATTLIST #NOTATION HTML
ArcDTD CDATA #FIXED "%HtmlDtd" -- Meta-DTD entity --
ArcDocF NAME #FIXED "HTML" -- Document element name --
ArcNamrA NAME #IMPLIED -- Default: no renaming --
-- See [HyTime A.3.4.2] --
>
or
<?IS10744
arch name="html"
public-id="ISO/IEC 15445:2000//DTD HyperText Markup Language//EN"
dtd-system-id="./15445.dtd"
renamer-att="HTMLnames"
doc-elem-form="HTML"
>
NOTE: These two architectural support declarations are equivalent.
NOTE: In the first form, the attribute ArcNamrA
may be
defined as ArcNamrA NAME #FIXED "HTMLnames"
if renaming
is required [HyTime, A.3.5.2].
NOTE: The Processing instruction [8879, clause 8] based mechanisms used in the second form have not yet been published (February 2003). The International Standard forgot to give the first form.
The International Standard makes the comments in the DTD a part of the normative text.
The comments in the DTD which use the expressions "shall" or "shall not" are normative requirements of this International Standard. Comments which use the expression "should" or "should not" are recommendations of this International Standard. Comments which use the verbs "recommend" or "deprecate" are recommendations and deprecations of this International Standard.
NOTE: DTD comments in the W3C Recommendation for HTML 4.01 are informative only.
The document type definition (DTD) [8879 11.1] provided by ISO-HTML is divided into three parts which are grouped within a single file. Part 1 is a set of entity definitions required by the DTD and forms the ISO-HTML entity set. Part 2 defines the ISO-HTML element types and their content models, and Part 3 defines the attribute sets for each element type and provides additional normative refinements.
The International Standard also provides an SGML declaration [8879 13] which gives instructions to the SGML parser.
NOTE: The ISO-HTML SGML declaration is essentially the same as the SGML declaration in the W3C Recommendation for HTML 4.01.
The formal SGML definitions, i.e. the ISO-HTML DTD and the ISO-HTML SGML declaration, are part of the text of this International Standard and are protected by copyrights held by the IETF, the W3C (MIT, Inria, Keio) and the ISO/IEC. Permission to copy is granted provided the following copyright notice is included with all copies:
Permission to copy in any form is granted for use with validating and conforming systems and applications as defined in ISO/IEC 15445:2000, provided this copyright notice is included with all copies.
This provision allows you to make electronic copies of the file that contains the ISO-HTML DTD and the file that contains the SGML declaration. Make sure that the copies that you use are pristine. They should have the following 128-bit MD5 Message-Digest Algorithm checksums, as specified by RFC1321 and calculated by the GNU md5sum utility for text (not binary) files:
52a4de8d16bc469f42801924384d84fa 15445.dcl
cb098831761d5d7458084d6076c2d6eb 15445.dtd
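If md5sum is not to hand, the same comparison can be sketched with Python's standard hashlib module. The names EXPECTED, md5_hex and is_pristine are our own. The sketch hashes the bytes exactly as stored, so a copy whose line endings were converted in transit will not match.

```python
import hashlib

# Checksums quoted above.
EXPECTED = {
    "15445.dcl": "52a4de8d16bc469f42801924384d84fa",
    "15445.dtd": "cb098831761d5d7458084d6076c2d6eb",
}


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of data as 32 hexadecimal digits."""
    return hashlib.md5(data).hexdigest()


def is_pristine(path: str, name: str) -> bool:
    """Compare the file at path with the checksum recorded for name."""
    with open(path, "rb") as f:
        return md5_hex(f.read()) == EXPECTED[name]
```

A typical call would be is_pristine("./15445.dtd", "15445.dtd"), where the path is wherever your copy is stored.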
NOTE: The OASIS catalogue fragment described in this User's Guide is not a part of the International Standard. It may be copied without payment under the terms of the GNU General Public License.
NOTE: The checksums which appear in this clause are an example of automatically computed text in an ISO-HTML page. The technique is described in chapter Document preparation.
This International Standard requires a complete separation of style and content.
The International Standard is based on the well established principle that it is good document design to separate the content of a document from the intended style in which it is to be presented to a reader. This facilitates the reprocessing of documents in ways that were not envisaged when they were created, and thus protects the content owners' long term investment in documents.
A <STYLE> [W3C 14.2.3] element may be used in the head of a document as a container for a style sheet. The style sheet language is not defined by this International Standard.
Although the International Standard does not specify a style sheet language, this User's Guide recommends that authors of ISO-HTML documents use Cascading Style Sheets as specified by the World Wide Web Consortium.
Wherever this International Standard describes a possible presentation, e.g. as a button, the styling information is intended to provide assistance to the reader in understanding the semantics of the element or attribute. It is not intended as a normative style requirement.
All comments in HTML document instances shall appear in comment declarations. There shall be exactly one comment per comment declaration.
SGML differentiates between a comment [8879 10.3] which appears between pairs of double hyphens:
-- This is a comment --
and a comment declaration [8879 10.3] which has the form
<!--comment-- --comment-- --comment-- >
Notice that a comment may be followed by whitespace. The degenerate case
<!>
is allowed by SGML. A common beginner's mistake is to place multiple hyphens in a comment for decorative purposes:
<!----------------------------------------------------
Joe: have the Whizz-Bang lawyers check this out:
---------------------------------------------------->
This example is not valid SGML and it is not valid ISO-HTML, since the additional hyphens are not present in multiples of four.
Validating systems should find an SGML error in such invalid examples (the characters Joe: have the Whizz-Bang lawyers check this out: should not appear in whitespace). We leave you to count the hyphens and appreciate that you should not write -- within a comment.
The International Standard requires that all comments in ISO-HTML documents appear in comment declarations [8879 10.3]. There shall be one and only one comment per comment declaration. For example:
<!-- This is a single comment
in a comment declaration. -->
The intention of this provision is to facilitate the use of popular user agents which are unable to parse SGML and which cannot handle comments outside comment declarations.
The International Standard allows white space following the comment, so an author could write:
<!-- This is a single comment
followed by white space. --
>
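The one-comment rule is easy to test mechanically. The sketch below, whose function name is our own, accepts a comment declaration only if it consists of "<!--", a comment body that contains no "--", the closing "--", optional white space and ">". It checks one declaration at a time and does not locate declarations inside a document.

```python
import re

# "<!--", a body with no "--" inside it, "--", optional white space, ">".
_SINGLE_COMMENT = re.compile(r"<!--(?:(?!--).)*--\s*>", re.DOTALL)


def is_single_comment_declaration(decl: str) -> bool:
    """True if decl is a comment declaration holding exactly one comment."""
    return _SINGLE_COMMENT.fullmatch(decl) is not None
```

The decorated Whizz-Bang example above is rejected, as is the degenerate <!>, which SGML allows but which contains no comment at all.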
The following subchapters describe the refinements that the International Standard makes to those element [type]s defined by the W3C Recommendation for HTML 4.01 which are included in ISO-HTML.
The attributes of the <A> [W3C 12.2] element are restricted to:
CLASS and TITLE.
ID.
In order to resolve the ID/NAME case folding contradiction, we recommend that authors satisfy the competing requirements of SGML and the W3C Recommendation for HTML 4.01 by restricting themselves to the 40 characters "ABCDEFGHIJKLMNOPQRSTUVWXYZ.-_:0123456789" for ID and NAME values, and for the corresponding HREF values.
Do not assume that the values PiZZa and pizza will match; if they are to match, write both as PIZZA. Do not assume that PiZZa and Pizza are different; if they are to be different, write them as PIZZA-1 and PIZZA-2.
DIR and LANG.
ACCESSKEY, CHARSET
COORDS
The International Standard requires that the COORDS attribute not be specified if the SHAPE attribute has the value default.
NOTE: Contrary to what one might expect, if the SHAPE attribute is omitted, it takes the value rect, not default.
HREF
See common attribute ID above for a recommended restriction to the character set.
Although the International Standard is silent on the subject, it implies that HREF values should be tokenized in the same way as NAME values.
HREFLANG
NAME
See common attribute ID above for a recommended restriction to the character set.
In order to clarify which names will match, the International Standard requires that the attribute value be tokenized, which means that entity references and character references are replaced, entity ends and record starts are removed, and record end and separator characters (horizontal tabs) are replaced by a space. Any sequence of space characters is replaced by a single space and leading and trailing spaces are deleted [8879 7.9.3 and 10.1.7]. As a result the following examples specify the same NAME value:
NAME="Uncle Joe"
NAME="
Uncle
Joe
"
In SGML terms, this means that NAME attribute value specifications are processed as if the declared value were NAME even though the declared value is CDATA.
REL and REV.
SHAPE
TABINDEX and TYPE.
The International Standard recommends that authors of ISO-HTML documents use both the ID attribute and the NAME attribute. If both are used, then they shall be given identical values since this allows an SGML parser to verify that the values for different anchors are distinct.
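The tokenization of NAME values and the recommended 40-character restriction can both be sketched in a few lines. The function names are our own, and the sketch handles only the white space part of tokenization: it assumes that entity and character references have already been replaced.

```python
import re

# The 40 characters recommended above for ID, NAME and matching HREF values.
SAFE_CHARS = set("ABCDEFGHIJKLMNOPQRSTUVWXYZ.-_:0123456789")


def tokenize_name(value: str) -> str:
    """Collapse runs of spaces, tabs and record boundaries; trim both ends."""
    return re.sub(r"[ \t\r\n]+", " ", value).strip(" ")


def is_safe_anchor_name(value: str) -> bool:
    """True if value is non-empty and uses only the recommended characters."""
    return bool(value) and set(value) <= SAFE_CHARS
```

With these definitions the two "Uncle Joe" examples above tokenize to the same string, PIZZA-1 passes the character check and PiZZa does not.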
The <ADDRESS> [W3C 7.5.6] element indicates the author or originator of a document or major part of a document. The International Standard discourages its use for general markup by requiring that it appear only in the content of the elements: <BLOCKQUOTE> [W3C 9.2.2], <BODY> [W3C 7.5.1], <DIV> [W3C 7.5.4], <FIELDSET> [W3C 17.10], <FORM> [W3C 17.3] and <OBJECT> [W3C 13.3].
The <ADDRESS> [W3C 7.5.6] element should not be used to mark up, for example, a list of addresses of the members of a club.
ISO-HTML restricts the attributes of the <AREA> [W3C 13.6.1] element to:
CLASS, ID and TITLE.
DIR and LANG.
ACCESSKEY
ALT
We very strongly recommend that authors provide meaningful ALT attributes to support interoperability with speech-based and text-only agents. See 13.8 How to specify alternate text. The language and direction of text are defined by the containing elements.
COORDS, HREF, NOHREF, SHAPE and TABINDEX
The International Standard requires that COORDS not be specified if SHAPE has the value default.
NOTE: Contrary to what one might expect, if the SHAPE attribute is omitted, it takes the value rect, not default.
The International Standard requires that a value be provided for the ALT attribute, and that one of HREF or NOHREF be specified.
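These three <AREA> requirements lend themselves to a simple attribute check. In the sketch below the function name and the dictionary representation of the attributes are our own, and "one of HREF or NOHREF" is read as exactly one of the two, which is our interpretation rather than wording taken from the International Standard.

```python
def check_area(attrs: dict) -> list:
    """Return a list of problems found in one AREA element's attributes."""
    a = {key.upper(): value for key, value in attrs.items()}
    problems = []
    if "ALT" not in a:
        problems.append("ALT is required")
    # Assumption: "one of" is taken to mean exactly one.
    if ("HREF" in a) == ("NOHREF" in a):
        problems.append("specify one of HREF or NOHREF")
    if str(a.get("SHAPE", "")).lower() == "default" and "COORDS" in a:
        problems.append("COORDS not allowed with SHAPE=default")
    return problems
```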
ISO-HTML strengthens a recommendation in the W3C Recommendation for HTML 4.01 by insisting that the contents of the <BLOCKQUOTE> [W3C 9.2.2] element be specified without surrounding quotation marks. These may be added by a user agent through the use of a style sheet.
NOTE: Authors have recognized that popular browsers often present the <BLOCKQUOTE> [W3C 9.2.2] contents indented left and right, and they have misused the element to obtain this formatting effect for text which was not a block quotation. True block quotations were marked up with quotation marks such as ". The W3C try to provide backward compatibility in the W3C Recommendation for HTML 4.01 and this prevents them requiring the omission of quotation marks. ISO-HTML does not have a backward compatibility requirement, and can insist on quotation mark omission.
This example quotes from article 129C of the European Union Treaty. Here is the markup:
<BLOCKQUOTE
LANG=fr
TITLE="Traité sur l'Union Européenne, Article 129 C.">
<p>
Afin de réaliser les objectifs visés à l'article
129B, la Communauté :
<p>
met en oeuvre toute action qui peut s'avérer nécessaire
pour assurer l'interoperabilité des réseaux, en
particulier dans le domaine de l'harmonisation des
normes techniques ;
</BLOCKQUOTE>
The quotation contains two paragraphs which begin with <p> start tags. Note that the end tags </p> have been omitted. This is allowed in ISO-HTML's SGML-based markup by the omitted tag minimization [8879 11.2.2] specified in the DTD:
<!ELEMENT P - O (%text;)+ >
The "O" says that end-tags may be omitted. In the World Wide Web Consortium's Recommendation for XHTML™, which is an application of XML, such end tag omission is not allowed and the two end tags </p> would have to be provided. XML has disallowed all tag omission.
Here is a possible rendering of the quotation:
<< Afin de réaliser les objectifs visés à l'article 129B,
la Communauté :
met en oeuvre toute action qui peut s'avérer nécessaire
pour assurer l'interoperabilité des réseaux, en
particulier dans le domaine de l'harmonisation des
normes techniques ; >>
Although there is no requirement in SGML or ISO-HTML to place the value of the TITLE attribute on a single line, we encourage authors to do this, to facilitate the use of popular browsers while they move towards fuller conformance.
The start tag is required but the end tag is optional. We recommend that authors include the end tag if the document is to be the subject of further processing.
In order to facilitate the preparation of conforming ISO-HTML documents, the User's Guide provides a stricter definition for the content model of the <BODY> [W3C 7.5.1] element.
<!ELEMENT BODY - O ((%block;)*,(H1,DIV1)*) +(DEL|INS) >
This content model makes use of the element <DIV1>, which is not a part of ISO-HTML, to enforce strictly progressive nesting of sections. The <DIV1> tags generated during the preparation process will be removed after the document has been validated as conforming to the strict nesting requirement.
NOTE: Authors are not required to place <DIV1> tags in documents; they are deduced automatically by the SGML parser.
The International Standard requires that the <BUTTON> [W3C 17.5] element not contain the <A> [W3C 12.2], <BUTTON> [W3C 17.5], <FIELDSET> [W3C 17.10], <FORM> [W3C 17.3], <INPUT> [W3C 17.4], <LABEL> [W3C 17.9.1], <SELECT> [W3C 17.6] or <TEXTAREA> [W3C 17.7] elements. If the <BUTTON> [W3C 17.5] element contains an <IMG> [W3C 13.2] element, the International Standard requires that the <IMG> [W3C 13.2] not have an ISMAP or USEMAP attribute.
The attributes of the <BUTTON> [W3C 17.5] element are restricted to:
CLASS, ID and TITLE.
DIR and LANG.
ACCESSKEY, DISABLED
NAME
This attribute is required if the TYPE attribute has the value submit.
TABINDEX
TYPE
Specifies the behaviour associated with the button and takes one of the following values:
TYPE=reset
If the <BUTTON> [W3C 17.5] is contained in a <FIELDSET> [W3C 17.10], the reset action is limited to the contents of the <FIELDSET> [W3C 17.10].
TYPE=submit
VALUE
This attribute is required if the TYPE attribute has the value submit, and specifies the value to be returned if the button is selected.
ISO-HTML requires that the TYPE attribute be provided, and when the TYPE is specified as submit, the NAME and VALUE attributes shall be provided.
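The attribute requirements for <BUTTON> can be expressed as a short check. This is a sketch under our own conventions: the function name check_button is hypothetical, and attributes are passed as a dictionary keyed by attribute name.

```python
def check_button(attrs: dict) -> list:
    """Return a list of problems found in one BUTTON element's attributes."""
    a = {key.upper(): value for key, value in attrs.items()}
    problems = []
    if "TYPE" not in a:
        problems.append("TYPE is required")
    elif str(a["TYPE"]).lower() == "submit":
        # NAME and VALUE become mandatory for submit buttons.
        for required in ("NAME", "VALUE"):
            if required not in a:
                problems.append(required + " is required when TYPE=submit")
    return problems
```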
The International Standard restricts the attributes of the <COL> [W3C 11.2.4] element to:
The International Standard restricts the attributes of the <COLGROUP> [W3C 11.2.4] element to:
The SPAN attribute should only be used if the <COLGROUP> [W3C 11.2.4] element has no content.
The header of a document provides information about the document rather than the content of the document. Such meta-information is potentially very important for libraries and applications based on large document collections. We recommend that authors give careful attention to their document headers as part of the overall architecture and design of their applications.
The start tag of the <HEAD> [W3C 7.4.1] element is required by ISO-HTML and shall not be omitted.
Scripting is not yet considered to be sufficiently stable and mature to be included in an International Standard, so the <HEAD> [W3C 7.4.1] element content model does not include the <SCRIPT> [W3C 18.2.1] element.
In SGML vocabulary, the element which contains the document instance is known as the document element [8879 4.99 and 7.2]. Many historic HTML documents omitted the document element tags, and the W3C Recommendation for HTML 4.01, in an effort at backward compatibility, continues to allow omission of the document element start and end tags. ISO-HTML has no backward compatibility requirement, and requires that both the start and end tags of the <HTML> [W3C 7.3] element be present. They shall not be omitted.
This User's Guide provides a specification for an "HTML in preparation" document which facilitates validation. Since the preparation documents are technically not ISO-HTML, their document element is changed to <Pre-HTML> to avoid any possible confusion.
The structural elements BODY, H1, P, ... were invented in the late 60s and have re-appeared in many SGML-based markup languages since. An historic example is the general document DTD (GDOC) [SGML Annex E.1]. The notion of sectioning that the elements provide is most clear in the industrial strength DocBook DTD where a chapter corresponds to the BODY of an HTML page. In DocBook, a typical chapter is
<chapter><title>My Chapter</title>
<para> ... </para>
<sect1><title>First section</title>
<para> ... </para>
<example> ... </example>
</sect1>
</chapter>
There are three ideas here:
sectn in DocBook.
sectn in DocBook.
title in DocBook.
A document designer, when creating a DTD, needs to have at least two elements which represent these three ideas in order to fully structure the document. DocBook has chosen elements to represent the nested section and the text of the title. HTML has only one element which represents the text of the title.
The following table shows the correspondence with GDOC, HTML and Pre-HTML:
DocBook | GDOC | HTML | Pre-HTML |
---|---|---|---|
chapter | h0 | missing | missing |
sect1 | h1 | missing | DIV1 |
sect2 | h2 | missing | DIV2 |
para | p | P | P |
title | h0t, ..., h3t | H1, ..., H6 | H1, ..., H6 |
HTML appears to put H1, ..., H6 in the "wrong" place, confusing the text of a title with the beginning of a new nested section.
ISO-HTML considers that the H1, H2,... of HTML still identify sections even though they contain only the section title. The "H1 section" exists up to the next H1 or the end of the body.
The <DIV> [W3C 7.5.4] element in HTML does not have the same nested section semantics as DocBook's sectn. This is why ISO-HTML, which is very strict about document structuring, does not allow <DIV> [W3C 7.5.4] to be intermixed with nested sections.
ISO-HTML takes a very strict view of the nesting of sections. Sections are considered to be important building blocks in documents, and maintaining the integrity of their relationships is considered vital. ISO-HTML considers that the <H1> [W3C 7.5.5] element specifies the beginning of a major section of a document and contains the title of that major section. In the past, many authors have used section header elements only for their appearance, typically giving the author a set of larger fonts with a visual browser. The W3C offer the following light deprecation of this usage:
Some people consider skipping heading levels to be bad practice
but accept headings in any order, in an effort to promote backward compatibility.
ISO-HTML considers that the <H1> [W3C 7.5.5] through <H6> [W3C 7.5.5] elements identify sections of increasing depth and requires that the trees formed by the containment of sections be rooted at the <H1> [W3C 7.5.5] element, and that no intermediate level be skipped.
The International Standard requires that the <H1> [W3C 7.5.5] element not be followed by an <H3> [W3C 7.5.5], <H4> [W3C 7.5.5], <H5> [W3C 7.5.5], or <H6> [W3C 7.5.5] element without an intervening <H2> [W3C 7.5.5] element. This requirement is expressed as normative text in the DTD, but cannot be specified in the DTD content models without introducing additional elements which are not a part of the language. It is possible to make the introduction of new elements entirely automatic, without them appearing in the source document, but the use of general purpose SGML tools such as sgmlnorm which parse documents and re-issue them with all start and end tags included poses a problem since these "normalized" documents are not valid ISO-HTML.
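Outside an SGML parser, the nesting rule can be checked from the sequence of heading levels alone. The sketch below uses a function name of our own and assumes that the caller has already extracted the levels in document order. It reports every heading that is more than one level deeper than the heading before it, treating the start of the body as level 0 so that only <H1> may come first.

```python
def heading_errors(levels):
    """levels: heading levels in document order, e.g. [1, 2, 3, 2].

    Returns (position, level) pairs for headings that skip a level.
    """
    errors = []
    previous = 0  # nothing seen yet, so only H1 may come first
    for position, level in enumerate(levels):
        if level > previous + 1:
            errors.append((position, level))
        previous = level
    return errors
```

Moving back up by any number of levels is always allowed; only a downward jump of more than one level is reported.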
The attributes of the <H1> [W3C 7.5.5] element are restricted to:
To make it possible for an SGML parser to validate the correct nesting of sections, this User's Guide provides an "almost ISO-HTML" document type definition which may be used to facilitate preparation of valid ISO-HTML. The document element of this "preparation ISO-HTML" has been changed from <HTML> [W3C 7.3] to <Pre-HTML> to avoid any confusion. The <Pre-HTML> DTD automatically introduces new elements required for the validation process. A simple program or a procedure based on architectural forms may be used later to remove the unwanted elements to produce valid ISO-HTML.
The ISO-HTML DTD may be switched to the <Pre-HTML> DTD through use of the Preparation parameter entity. If Preparation has the value INCLUDE, the alternate definition of the <H1> [W3C 7.5.5] element requires the correct nesting of headings and sections:
<!ELEMENT H1 - - (%text;)+ >
<!ELEMENT DIV1 O O ((%block;)*,(H2,DIV2)*) >
For further details, see SGML engineering.
The recommended way of specifying that the <Pre-HTML> DTD is to be used is by preceding the document instance with the ISO-HTML preparation document type declaration. This has the effect of setting the Preparation parameter entity to the value INCLUDE.
The <DIV1> through <DIV6> elements are for internal use only within the DTD and are not a part of the language. They shall not appear in any ISO-HTML document or associated style sheet.
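The "simple program" that removes the preparation elements could be as small as the following sketch. It is our own illustration, not the tool used to prepare this Guide: it assumes the <DIV1> through <DIV6> tags appear in normalized output without attributes, it works on text rather than on a parsed document, and it leaves the document element and the document type declaration untouched.

```python
import re

# Start and end tags of DIV1 ... DIV6 only; plain <DIV> is left alone.
_DIV_N_TAG = re.compile(r"</?DIV[1-6]\s*>", re.IGNORECASE)


def strip_preparation_divs(text: str) -> str:
    """Remove DIV1..DIV6 tags, keeping everything they contain."""
    return _DIV_N_TAG.sub("", text)
```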
ISO-HTML considers that the <H2> [W3C 7.5.5] element specifies the beginning of a section of a document and contains the title of that section.
The International Standard requires that the <H2> [W3C 7.5.5] element not be followed by an <H4> [W3C 7.5.5], <H5> [W3C 7.5.5], or <H6> [W3C 7.5.5] element without an intervening <H3> [W3C 7.5.5] element. An <H2> [W3C 7.5.5] element shall be preceded by an <H1> [W3C 7.5.5] element.
The attributes of the <H2> [W3C 7.5.5] element are restricted to:
ISO-HTML considers that the <H3> [W3C 7.5.5] element specifies the beginning of a subsection of a document and contains the title of the subsection.
The <H3> [W3C 7.5.5] element shall not be followed by an <H5> [W3C 7.5.5] or <H6> [W3C 7.5.5] element without an intervening <H4> [W3C 7.5.5] element. An <H3> [W3C 7.5.5] element shall be preceded by an <H2> [W3C 7.5.5] element.
The attributes of the <H3> [W3C 7.5.5] element are restricted to:
ISO-HTML considers that the <H4> [W3C 7.5.5] element specifies the beginning of a subsubsection of a document and contains the title of the subsubsection.
The <H4> [W3C 7.5.5] element shall not be followed by an <H6> [W3C 7.5.5] element without an intervening <H5> [W3C 7.5.5] element. An <H4> [W3C 7.5.5] element shall be preceded by an <H3> [W3C 7.5.5] element.
The attributes of the <H4> [W3C 7.5.5] element are restricted to:
ISO-HTML considers that the <H5> [W3C 7.5.5] element specifies the beginning of a subsubsubsection of a document and contains the title of the subsubsubsection.
An <H5> [W3C 7.5.5] element shall be preceded by an <H4> [W3C 7.5.5] element.
The attributes of the <H5> [W3C 7.5.5] element are restricted to:
ISO-HTML considers that the <H6> [W3C 7.5.5] element specifies the beginning of a minor subsubsubsection of a document and contains the title of the minor subsubsubsection.
An <H6> [W3C 7.5.5] element shall be preceded by an <H5> [W3C 7.5.5] element.
The attributes of the <H6> [W3C 7.5.5] element are restricted to:
The attributes of the <IMG> [W3C 13.2] element are restricted to:
CLASS, ID and TITLE.
DIR and LANG.
ALT
ISMAP
The International Standard requires that if the ISMAP attribute is present in an <IMG> [W3C 13.2] element, the <IMG> [W3C 13.2] element shall be contained in an <A> [W3C 12.2] element with an HREF attribute present.
LONGDESC, SRC, USEMAP.
The International Standard requires that the SRC and ALT attributes be provided. At most one of the attributes ISMAP and USEMAP may be provided.
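The <IMG> requirements combine attribute rules with one rule about context. A sketch of a check follows; the function name and the enclosing_a_has_href parameter are our own devices, the latter standing in for whatever the caller knows about an enclosing <A> element.

```python
def check_img(attrs: dict, enclosing_a_has_href: bool = False) -> list:
    """Return a list of problems found in one IMG element."""
    names = {key.upper() for key in attrs}
    problems = [n + " is required" for n in ("SRC", "ALT") if n not in names]
    if "ISMAP" in names and "USEMAP" in names:
        problems.append("at most one of ISMAP and USEMAP")
    if "ISMAP" in names and not enclosing_a_has_href:
        problems.append("ISMAP requires an enclosing A with HREF")
    return problems
```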
The TYPE attribute of the <INPUT> [W3C 17.4] element discriminates between several different types of input field. The set of applicable attributes depends on the value of the TYPE attribute as specified in the following subchapters. By default the value of the TYPE attribute is "text".
The value "button" for the attribute TYPE is not available in ISO-HTML. Authors wishing to place button-like devices in documents should use the <BUTTON> [W3C 17.5] element.
For all values of the TYPE attribute, the <INPUT> [W3C 17.4] element carries the following attributes:
ISO-HTML restricts the other attributes of the <INPUT> [W3C 17.4] element to ACCEPT, ACCESSKEY, CHECKED, DISABLED, MAXLENGTH, NAME, READONLY, SIZE, TABINDEX, TYPE and VALUE. Their use depends on the value of the TYPE attribute as specified in the following subchapters.
Pairs of NAME, VALUE attributes are known as controls and are described in clause 17.2 Controls. When they are submitted for processing they are known as successful controls and described in clause 17.13.2 Successful controls in the W3C Recommendation for HTML 4.01.
For some values of attribute TYPE, the attribute TABINDEX is available: its value is a non-negative integer. An SGML number [8879 9.3] is merely a token in which the characters are restricted to digits. 14 and 00014 are not the same number/token since the character strings are not the same. ISO-HTML recommends that the number be given an integer interpretation, with leading zeroes ignored, in the manner of a programming language.
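The difference between the token view and the integer view of TABINDEX is easy to demonstrate. In this sketch the function name is our own; it accepts only the digits 0 to 9 and then applies the recommended integer interpretation.

```python
import re


def tabindex_value(token: str) -> int:
    """Interpret a TABINDEX token as a non-negative integer."""
    if not re.fullmatch(r"[0-9]+", token):
        raise ValueError("TABINDEX must consist of digits only: %r" % token)
    return int(token)  # leading zeroes are ignored
```

As tokens, "14" and "00014" are different strings; as values returned by tabindex_value they are equal.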
TYPE=checkbox
An <INPUT> [W3C 17.4] element with TYPE=checkbox specifies a boolean choice. A set of <INPUT> [W3C 17.4] elements in the same <FORM> [W3C 17.3] element with the same NAME attribute value represents an n-of-many choice.
The other attribute values are as follows:
ACCESSKEY, CHECKED, DISABLED.
NAME
This attribute is required.
TABINDEX
VALUE
This attribute is required.
TYPE=file
An <INPUT> [W3C 17.4] element with TYPE=file provides a means for users to attach a file to a form's content. The <INPUT> [W3C 17.4] is typically structured within a <FIELDSET> [W3C 17.10] containing text and an associated <BUTTON> [W3C 17.5] which when selected invokes a file browser to select a file name. The file name can also be entered directly in the text field. See RFC1867 for further details.
It is important that a user agent not send any file that the user has not explicitly authorized to be sent. Thus ISO-HTML interpreting agents are expected to confirm any default file names that might be suggested. ISO-HTML requires that fields specifying files not be hidden.
The other attribute values are as follows:
TYPE=hidden
An <INPUT> [W3C 17.4] element with TYPE=hidden declares that a field should not be rendered; it is hidden from the user. The user does not interact with the field; instead, the VALUE attribute specifies the value of the field. The NAME and VALUE attributes are required, and are returned to the server when the form is submitted.
This input element may be used to provide state information in a form.
The other attribute values are as follows:
TYPE=password
An <INPUT> [W3C 17.4] element with TYPE=password specifies a single line text field into which users may type a password. As the user types, the characters are usually echoed as `*' to hide the password from prying eyes.
Application designers should note that this is only a light security protection. Although the password is masked by the browser from casual observers, it may be transmitted back to the server in clear text, and can be read by anyone with low-level access to the network. It is possible to specify encryption using the ACTION attribute of <FORM> [W3C 17.3]; however, details are beyond the scope of the Guide.
The other attribute values are as follows:
TYPE=radio
An <INPUT> [W3C 17.4] element with TYPE=radio specifies a boolean choice: "on" or "off". A set of <INPUT> [W3C 17.4] elements in a <FORM> [W3C 17.3] element with the same NAME attribute value collectively represents a 1-of-many choice. Only one is "on", and all the others are "off".
The other attribute values are as follows:
ACCESSKEY, CHECKED, DISABLED.
NAME
This attribute is required.
TABINDEX
VALUE
This attribute is required.
ISO-HTML requires that at all times one and only one of the radio buttons in a set be checked. Initially, if none of the <INPUT> [W3C 17.4] elements in a set of radio buttons specifies CHECKED, then the user agent shall mark the first radio button of the set as checked.
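The initial state of a set of radio buttons can be sketched as a small function. The name initially_checked and the list-of-booleans representation are our own. Treating two CHECKED buttons in one set as an error is our reading of the "one and only one" requirement.

```python
def initially_checked(checked_flags):
    """checked_flags: one bool per radio button of a set, in document order.

    Returns the index of the button that is checked initially.
    """
    if not checked_flags:
        raise ValueError("empty radio button set")
    marked = [i for i, flag in enumerate(checked_flags) if flag]
    if len(marked) > 1:
        raise ValueError("more than one radio button is CHECKED")
    # With no CHECKED attribute at all, the first button is marked.
    return marked[0] if marked else 0
```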
TYPE
=reset
An <INPUT>
[W3C 17.4] element with TYPE
=reset
specifies an
input option, usually represented by a button, that instructs a user
agent to reset the form's fields to their initial states.
This behaviour is also offered by the <BUTTON>
[W3C 17.5] element which should
be preferred.
There is an inconsistency between the behaviour of the <BUTTON>
[W3C 17.5]
element type with attribute TYPE
=reset
when
contained in a <FIELDSET>
[W3C 17.10], and the behaviour of the <INPUT>
[W3C 17.4] element
type with attribute TYPE
=reset
when contained in
a <FIELDSET>
[W3C 17.10].
In the case of <BUTTON>
[W3C 17.5], the reset action is limited to the contents
of the <FIELDSET>
[W3C 17.10], but in the case of <INPUT>
[W3C 17.4], the International Standard omits to
state the limitation. See reported defect 8.
We recommend that authors and application designers assume that the
same limitation exists for <BUTTON>
[W3C 17.5] and <INPUT>
[W3C 17.4].
The other attribute values are as follows:
TYPE
=submit
An <INPUT>
[W3C 17.4] element with TYPE
=submit
represents
an input option, typically a button, that instructs a user agent to
submit the form.
This behaviour is also offered by the <BUTTON>
[W3C 17.5] element which should
be preferred.
The other attribute values are as follows:
TYPE
=text
An <INPUT>
[W3C 17.4] element with TYPE
=text
specifies a
single line text field into which users may type a string.
The other attribute values are as follows:
ACCESSKEY, DISABLED, MAXLENGTH.
NAME
This attribute is required.
READONLY, SIZE, TABINDEX.
VALUE
This attribute is required.
The International Standard requires that the <LABEL>
[W3C 17.9.1] element refer to a form field in
the content of the <FORM>
[W3C 17.3] element which contains the <LABEL>
[W3C 17.9.1].
ISO-HTML restricts the attributes of the <LINK>
[W3C 12.3] element to:
CLASS, ID and TITLE.
DIR and LANG.
CHARSET
HREF
HREFLANG, MEDIA
REL
The REL
attribute defines the relationship of the
target anchor to the source anchor.
REV
The REV
attribute defines the relationship of the
source anchor to the target anchor. The same generally recognized
values are available for the REV
attribute as for the REL
attribute, but the semantics are reversed for a given link. For
example:
REV
=contents
The current document serves as a table of contents for the document referred to by the link.
TYPE
In this example the current document is "Chapter2.html", and the links describe the relationships with the preceding and following chapters:
<HEAD>
<LINK REL="Index" HREF="../index.html">
<LINK REL="Next" HREF="Chapter3.html">
<LINK REV="Previous" HREF="Chapter3.html">
<LINK REV="Next" HREF="Chapter1.html">
</HEAD>
If the HREF
is unchanged, changing REL
to REV
or vice
versa requires reversing the semantics of the REL
/REV
attribute.
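The reversal can be sketched in Python. This is an illustration only: the function name `flip_link` and the table `INVERSE` are assumptions made for this sketch, and the only inverse pair recorded is the Next/Previous pair used in the example above. Link types without a recognised inverse are rejected rather than guessed.

```python
from typing import Tuple

# Inverse link types, taken from the Chapter2.html example above.
# Any other pair would be an assumption and is deliberately left out.
INVERSE = {"Next": "Previous", "Previous": "Next"}


def flip_link(attribute: str, link_type: str) -> Tuple[str, str]:
    """Swap REL and REV for an unchanged HREF, reversing the link type."""
    if attribute not in ("REL", "REV"):
        raise ValueError("attribute must be REL or REV")
    if link_type not in INVERSE:
        raise KeyError("no inverse link type known for " + link_type)
    flipped = "REV" if attribute == "REL" else "REL"
    return flipped, INVERSE[link_type]
```

With HREF="Chapter3.html" held fixed, `flip_link("REL", "Next")` yields the pair `("REV", "Previous")`, which matches the second and third LINK elements of the example.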
The International Standard requires that the NAME
attribute be provided.
In order to resolve the ID
/NAME
case folding
contradiction, we recommend that authors satisfy the competing
requirements of SGML and the W3C Recommendation for HTML
4.01 by restricting themselves to
the 40 characters "ABCDEFGHIJKLMNOPQRSTUVWXYZ.-_:0123456789" for ID
and NAME
values, and for the corresponding HREF
values.
In SGML terms, the attribute value specification shall be processed as if the declared value were NAME.
Entity references and character references are replaced, entity ends and record starts are removed, record end and separator characters are replaced by a space. Any sequence of space characters is replaced by a single space and leading and trailing spaces are deleted, [8879 7.9.3 and 10.1.7].
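The processing described above may be approximated by the following Python sketch. It is not a substitute for an SGML parser: entity and character references are assumed to have been replaced already, the function name `normalize_name_value` is an assumption made for this illustration, and upper-case folding is applied with `str.upper`, which agrees with the SGML declaration for ISO-HTML only for unaccented Latin letters.

```python
import re

# Function characters as assigned in the SGML declaration for ISO-HTML.
RS = "\n"       # record start, character 10
RE = "\r"       # record end, character 13
SEPCHAR = "\t"  # separator character (TAB), character 9


def normalize_name_value(literal: str) -> str:
    """Approximate the processing of an attribute value literal whose
    declared value is NAME. References are assumed to be already resolved."""
    text = literal.replace(RS, "")                      # record starts are removed
    text = text.replace(RE, " ").replace(SEPCHAR, " ")  # RE and SEPCHAR become spaces
    text = re.sub(r" +", " ", text).strip(" ")          # collapse and trim spaces
    return text.upper()                                 # NAMECASE GENERAL YES folds names
```

The sketch does not verify that the result is a valid SGML name; it shows only the order of the normalization steps.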
The
International Standard recommends that authors of ISO-HTML documents use both the
ID
attribute and the NAME
attribute. If both are used, then
they shall be given identical values since this
allows an SGML parser to verify that the values for different anchors
are distinct.
The first edition of the International Standard provided only <AREA>
[W3C 13.6.1] elements to
specify the shape of the map. These are essentially graphic and are
not suitable for sight-impaired or blind users. See defect 4.
The W3C Recommendation for HTML
4.01 extends the content model of the <MAP>
[W3C 13.6.1] element type to
include block
elements as well as <AREA>
[W3C 13.6.1] elements. The block
elements provide a richer means of describing the map areas, allowing
alternative descriptions of the areas suitable for speech browsers.
They are intended to improve accessibility, and the International Standard recommends
that they be used by authors and rendered by browsers. Although the
International Standard expresses the requirement as a recommendation by the
use of the word "should", the use of block-level content
should be understood as a strict requirement.
Authors should use the block-level content of the
<MAP>
[W3C 13.6.1] element when creating accessible documents. Each region should be specified using an <A>
[W3C 12.2] element to define its associated link and shape. User agents should render the block-level content of a <MAP>
[W3C 13.6.1] element.
Here is an example of a national accessibility requirement.
The following example shows the use of block-level content to describe five polygons placed in a figure. Each polygon is inscribed in a circle radius R. Selecting one of the polygons leads to a formula for the surface area S.
If the circle has radius R, then the surface area S of the inscribed polygon is:
Triangle: S = (3 * R**2 * sqrt(3)) / 4
Hexagon: S = (3 * R**2 * sqrt(3)) / 2
Decagon: S = (5 * R**2 * sqrt(10 - 2 * sqrt(5))) / 4
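The formulae shown are instances of the general expression for a regular polygon with n sides inscribed in a circle of radius R, namely S = (n / 2) * R**2 * sin(2 * pi / n). The following Python sketch is illustrative only and is not part of the example markup; the names `inscribed_polygon_area` and `POLYGONS` are assumptions made for this illustration. It evaluates the expression for all five polygons of the figure.

```python
import math


def inscribed_polygon_area(sides: int, radius: float) -> float:
    """Surface area of a regular polygon inscribed in a circle."""
    if sides < 3:
        raise ValueError("a polygon needs at least three sides")
    return 0.5 * sides * radius ** 2 * math.sin(2 * math.pi / sides)


# The five polygons of the figure and their number of sides.
POLYGONS = {
    "Triangle": 3,
    "Square": 4,
    "Hexagon": 6,
    "Decagon": 10,
    "Duodecagon": 12,
}

if __name__ == "__main__":
    for name, sides in POLYGONS.items():
        area = inscribed_polygon_area(sides, 1.0)
        print(f"{name}: S = {area:.4f} * R**2")
```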
The markup used in this example is as follows. It includes an <AREA>
[W3C 13.6.1]
specification of the selectable areas for browsers which cannot handle
block content in a <MAP>
[W3C 13.6.1].
<!-- This map describes a 632x128 pixel
drawing of five polygons. -->
<map id="POLYGONMAP" name="POLYGONMAP">
<p>
<a href="#TRIANGLE"
shape="rect"
coords=" 0,0, 125,127">Triangle</a> or
<a href="#SQUARE"
shape="rect"
coords="126,0, 251,127">square</a> or
<a href="#HEXAGON"
shape="rect"
coords="252,0, 377,127">hexagon</a> or
<a href="#DECAGON"
shape="rect"
coords="378,0, 503,127">decagon</a> or
<a href="#DUODECAGON"
shape="rect"
coords="504,0, 631,127">duodecagon</a>.
<!-- Markup for browsers which cannot
handle block content in MAP -->
<area href="#TRIANGLE"
shape="rect" coords=" 0,0, 125,127"
alt="Triangle inscribed in a circle">
<area href="#SQUARE"
shape="rect" coords="126,0, 251,127"
alt="Square inscribed in a circle">
<area href="#HEXAGON"
shape="rect" coords="252,0, 377,127"
alt="Hexagon inscribed in a circle">
<area href="#DECAGON"
shape="rect" coords="378,0, 503,127"
alt="Decagon inscribed in a circle">
<area href="#DUODECAGON"
shape="rect" coords="504,0, 631,127"
alt="Duodecagon inscribed in a circle">
</map>
<!-- Offer the visitor a choice of polygon. -->
<p>
<img src="polygon.png"
class="fullwidth"
alt="Five regular polygons each inscribed in a circle"
title="Choose a polygon"
usemap="#POLYGONMAP">
The attributes of the <OBJECT>
[W3C 13.3] element are restricted to:
CLASS, ID and TITLE.
The ID attribute is also available to assist inter agent communication.
DIR and LANG.
CLASSID, CODEBASE, CODETYPE, DATA, DECLARE, NAME, STANDBY, TABINDEX, TYPE, USEMAP.
The contents of the <Q>
[W3C 9.2.2] element shall not be surrounded with
quotation marks. These may be added by the user agent through the use
of a style sheet.
A <Q LANG=de>quotation in German</Q> and
a <Q LANG=fr>quotation in French</Q>.
might be rendered as:
A ,,quotation in German'' and a << quotation in French >>.
The <STYLE>
[W3C 14.2.3] element contains style sheet information which shall be
passed to the user agent's style manager. Any style sheet language
may be used, and none is defined by the International Standard.
It is a user agent error to render the style sheet information as if it were part of a document's text.
We recommend that authors:
<META>
[W3C 7.4.4] element.
The attributes of the <TABLE>
[W3C 11.2.1] element are restricted to:
CLASS, ID and TITLE.
DIR and LANG.
SUMMARY
The SUMMARY
attribute is required by the International Standard and
shall be provided.
In ISO-HTML the start tag is required for the <TBODY>
[W3C 11.2.3] element.
The attributes of the <TD>
[W3C 11.2.6] element are restricted to:
CLASS, ID and TITLE.
DIR and LANG.
ABBR, AXIS, COLSPAN, HEADERS, ROWSPAN and SCOPE.
The attributes of the <TH>
[W3C 11.2.6] element are restricted to:
CLASS, ID and TITLE.
DIR and LANG.
ABBR, AXIS, COLSPAN, HEADERS, ROWSPAN and SCOPE.
It is recommended that authors pay attention to the following points in order to avoid inconsistent rendering of their tables.
The <TR>
[W3C 11.2.5] element should require exactly the same number of columns as
the number of columns specified by the <COL>
[W3C 11.2.4] or <COLGROUP>
[W3C 11.2.4] elements
in the containing <TABLE>
[W3C 11.2.1] element, if present, taking into account
the effect of the ROWSPAN
and COLSPAN
attributes of the <TD>
[W3C 11.2.6]
and <TH>
[W3C 11.2.6] elements, the SPAN
attributes of the <COL>
[W3C 11.2.4] and
<COLGROUP>
[W3C 11.2.4] elements and the padding of incomplete rows by a user
agent.
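Authors who wish to check this point mechanically might proceed along the lines of the following Python sketch. It is a simplified illustration: each cell is represented only by its COLSPAN and ROWSPAN values, the names `row_widths` and `rows_match_columns` are assumptions made for this sketch, overlapping spans are not diagnosed, and the padding of incomplete rows by a user agent is not modelled.

```python
from typing import List, Tuple

Cell = Tuple[int, int]  # (COLSPAN, ROWSPAN) of one TD or TH element


def row_widths(rows: List[List[Cell]]) -> List[int]:
    """Return the number of columns occupied by each row of a table."""
    widths = []
    remaining: List[int] = []  # per column: later rows still covered by a ROWSPAN
    for row in rows:
        # Columns taken by cells that span down from earlier rows.
        occupied = [left > 0 for left in remaining]
        remaining = [max(left - 1, 0) for left in remaining]
        col = 0
        for colspan, rowspan in row:
            # Skip columns already taken by a ROWSPAN from above.
            while col < len(occupied) and occupied[col]:
                col += 1
            for _ in range(colspan):
                if col == len(occupied):
                    occupied.append(False)
                    remaining.append(0)
                occupied[col] = True
                remaining[col] = rowspan - 1
                col += 1
        used = [index + 1 for index, taken in enumerate(occupied) if taken]
        widths.append(max(used, default=0))
    return widths


def rows_match_columns(rows: List[List[Cell]], declared: int) -> bool:
    """True if every row occupies exactly the declared number of columns."""
    return all(width == declared for width in row_widths(rows))
```

The declared number of columns would be obtained by summing the SPAN values of the <COL> and <COLGROUP> elements, which is left to the caller.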
The attributes of the <TR>
[W3C 11.2.5] element are restricted to:
This chapter describes an SGML-based process for preparing ISO-HTML conforming documents. The process is not a part of the International Standard, but is intended to make it easier to conform to the International Standard. The principal advantages are:
NOTE: This is the technique used to specify the many links between the User's Guide and the W3C Recommendation for HTML 4.01.
More complex SGML-based processes are possible. For example, the source document may be structured using a richer DTD or a richly structured document database. This has advantages when a document represents a major investment and is used to generate a range of output. The processing of such documents is beyond the scope of this User's Guide.
The process uses the document type declaration internal subset
[8879 11.1] which is a feature of the DOCTYPE
declaration not supported by the International Standard or the W3C Recommendation for HTML
4.01. In order to
clearly identify the documents-in-preparation as being different from
ISO-HTML or HTML 4, we give them a different document element
<Pre-HTML>
. This document element is only valid for
documents-in-preparation.
The internal subset appears between square brackets in the
DOCTYPE
declaration as shown in the following figure.
Before describing the contents of the figure, a short discussion of entities in SGML may be useful. An SGML entity [8879 B.6] may be thought of as a chunk of document — a programmer might prefer to use the term macro. There are two types of entity in SGML:
Parameter entities are also referenced in a document instance in the status keyword specification of marked sections where they provide the keywords INCLUDE or IGNORE for optional sections of text.
&agrave;
used to provide a lower case a with a grave
accent which appears at the end of the word voilà.
NOTE: The two types of entity serve the same basic purpose. The reason for having two types is to have two name spaces. The document author need not be concerned about overloading an entity name already chosen by the support people who define the document type declaration.
The "legal
" ENTITY declaration [8879 10.5] in the
subset has a %
character before the entity name. This
indicates that "legal
" is a parameter entity for
use in the subset. The notation "%legal;
"
[8879 9.4.4] is a reference to the parameter entity and in the
example shown, an SGML parser will resolve the
parameter entity to a declaration of the general entity
&fineprint;
which may be used in the document. The
resolution process is indirect: an OASIS catalogue fragment, usually in a file
"catalog
", points to the file which contains the general
entity definition. The lookup is done using the Formal Public
Identifier [8879 10.2], in the example given:
"-//Whiz-Bang//TEXT Legal//EN
". The result is to make
the general entity &fineprint;
available for use in
the document.
At first sight this process may seem complex, but in a large
production environment it has many advantages. The document author
can work without having to be concerned about which file contains the
latest fine print. The system administrator manages the OASIS catalogue and
the legal department can work independently on their fine print. We
have shown an external file "fineprint.txt
" which
contains only one general entity declaration. In practice the
external file may contain hundreds of entity declarations, for example,
the official list of all the publicly available URLs and URIs offered by a
corporation.
The ISO-HTML page produced by the process does not contain an internal
subset or any indication of the existence of the parameter entity
%legal;
or the general entity
&fineprint;
NOTE: The catalogue fragment may be in the same "catalog
"
file as the OASIS catalogue fragment described in "SGML engineering" and the sample SGML catalog fragment
provided by the W3C Recommendation for HTML
4.01.
There are two preparation processes, both using the
sgmlnorm
feature of the SP parser
to produce a version of the document-in-preparation in which
In this process, the intermediate document produced by
sgmlnorm
contains the <DIV1>
... <DIV6>
element tags
which are not permitted in ISO-HTML. They are removed by a "scrubber"
which also replaces <Pre-HTML>
start and end tags by <HTML>
[W3C 7.3] start and
end tags. In addition, the scrubber places the ISO-HTML document type declaration at the head of the file.
This was the process initially used by the editors. The incantation for the International Standard was of the form:
sgmlnorm -e -g -w all -E 5 15445.Pre-HTML | scrubber > 15445.html
In this process, there is no intermediate document. We use the DTD for
ISO-HTML as an architectural form to which the output of
sgmlnorm
is to conform. Since the <DIV1>
... <DIV6>
element tags are not a part of ISO-HTML, they are ignored and do not
appear in the output. In order to set up the process, we place the
following declaration in the internal subset:
<!-- Use ISO-HTML as architectural form -->
<!ENTITY % HtmlDtd PUBLIC "ISO/IEC 15445:2000//DTD HTML//EN">
<?IS10744 ArcBase HTML>
<!NOTATION HTML PUBLIC
"-//ISO-HTML User's Guide//NOTATION HTML Architecture//EN">
<!ATTLIST #NOTATION HTML
ArcDTD CDATA #FIXED "%HtmlDtd" -- Meta-DTD entity --
ArcDocF NAME #FIXED "HTML" -- Document element name --
ArcNamrA NAME #IMPLIED -- Default: no renaming --
-- See [HyTime A.3.4.2] --
>
The incantation which produces the International Standard now takes the form:
sgmlnorm -A html -d -e -g -w all -E 5 15445.Pre-HTML > 15445.html
where the "-A html" specifies use of notation HTML as a meta-DTD, and
the option "-d" asks sgmlnorm
to place the document type declaration for the meta-DTD,
i.e. ISO-HTML, at the top of the output document instance.
It is common to see a time stamp at the foot of an HTML page such as
that of the Free Software Foundation:
Updated: 1 Jan 1998 rms
. It is possible to use the
<Pre-HTML>
techniques to set this time stamp automatically. We will
assume that you are using a Makefile
to build your pages.
In your Makefile, just before you
parse a page in which you wish to place a time stamp, insert the
following shell commands:
echo "<!ENTITY lastchange '" > lastchange
date >> lastchange
echo "' >" >> lastchange
NOTE: The three lines are indented with a tab, not spaces.
<!ENTITY % lastchange PUBLIC
"-//ISO-HTML User's Guide//TEXT Last change time stamp//EN" >
%lastchange;
-- Last change time stamp --
PUBLIC "-//ISO-HTML User's Guide//TEXT Last change time stamp//EN" lastchange
<hr>
<p>Last change was on &lastchange;
<hr>
You can adapt the formal public identifiers, entity names and time stamp text to your own needs.
NOTE: The parameter entity, the general entity and the temporary file which contains the time stamp all have the same name but since they are in different name spaces there is no ambiguity.
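Where a shell is not available, the entity declaration could equally be generated by a small script. The following Python sketch is an illustration under stated assumptions: the function name `lastchange_declaration` and the numeric date format are choices made for this example, whereas the shell commands above use the default output of the date command and place it on a line of its own.

```python
from datetime import datetime


def lastchange_declaration(moment: datetime) -> str:
    """Build the general entity declaration holding the time stamp."""
    # A purely numeric format avoids locale-dependent month names.
    stamp = moment.strftime("%Y-%m-%d %H:%M")
    return "<!ENTITY lastchange '" + stamp + "' >\n"
```

The result of `lastchange_declaration(datetime.now())` would then be written to the file named lastchange before the page is parsed.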
Authors are often interested in having a single document which describes something which has options, levels, releases or variations. That is, some part of the content is to be included only if a description of the "version 2.11" is needed, or if the reader has the required reading authority. The author would like to be able to specify to the SGML production process which parts of the document are to be included.
This is easy to do if the source file is marked up using the Pre-HTML DTD for documents-in-preparation. To include or exclude text, we use SGML marked sections [8879 10.4] managed from the document type declaration internal subset [8879 11.1] which is available in Pre-HTML.
The first version of some product was "easy to use", but following urgent safety improvements, the new version is "easy and safe to use". We handle this as follows:
easy <![ %version2; [and <em>safe</em>]]> to use
<!ENTITY % version2 PUBLIC
"-//WhizzBang//TEXT Include version 2//EN">
-- Version 2 inclusion flag --
PUBLIC "-//WhizzBang//TEXT Include version 2//EN"
includeV2
In the Makefile entry for version 1, we add the
declaration
echo "IGNORE" > includeV2
In the Makefile entry for version 2, we add the
declaration
echo "INCLUDE" > includeV2
An alternative, more direct process includes or excludes text using
the -i
option of sgmlnorm
easy <![ %version2; [and <em>safe</em>]]> to use
<!ENTITY % version2 "IGNORE" >
In the Makefile entry for version 1, we make no
reference to version2:
sgmlnorm .....
Since parameter entity version2
has the value
"IGNORE"
, the extra words are omitted.
In the Makefile entry for version 2, we add a
-i
option to the call of sgmlnorm:
sgmlnorm -i version2 ....
This has the effect of setting parameter entity version2
to "INCLUDE"
, and including the extra words.
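The effect of the status keyword on a marked section can be imitated by the following Python sketch. It is an illustration only and is far simpler than an SGML parser: nested marked sections and keywords other than INCLUDE and IGNORE are not handled, and the names `MARKED_SECTION` and `resolve_marked_sections` are assumptions made for this example.

```python
import re
from typing import Dict

# Matches a marked section whose status keyword is supplied by a
# parameter entity reference: <![ %name; [ content ]]>
MARKED_SECTION = re.compile(
    r"<!\[\s*%([A-Za-z][A-Za-z0-9.\-]*);\s*\[(.*?)\]\]>", re.DOTALL
)


def resolve_marked_sections(text: str, entities: Dict[str, str]) -> str:
    """Keep or drop each marked section according to its parameter entity."""

    def replace(match):
        keyword = entities[match.group(1)].strip().upper()
        if keyword == "INCLUDE":
            return match.group(2)  # keep the content, drop the delimiters
        if keyword == "IGNORE":
            return ""              # drop the whole marked section
        raise ValueError("unsupported status keyword: " + keyword)

    return MARKED_SECTION.sub(replace, text)
```

Applied to the phrase used above, the value "IGNORE" for version2 removes the words "and safe" together with their markup, and the value "INCLUDE" retains them.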
This chapter describes the SGML techniques that are used in the formal specification of ISO-HTML and Pre-HTML. Validating systems are required to support these techniques, but conforming systems are not.
The engineering is based on a three-step process:
IGNORE
for the
%Preparation;
parameter entity which manages the
customization of the DTD. This default value is overridden by
Pre-HTML documents which specify the value INCLUDE
for
the %Preparation;
parameter entity.
%Preparation;
parameter entity.
The DOCTYPE declarations [8879 11.1] for ISO-HTML are:
<!DOCTYPE HTML PUBLIC "ISO/IEC 15445:2000//DTD HyperText Markup Language//EN">
<!DOCTYPE HTML PUBLIC "ISO/IEC 15445:2000//DTD HTML//EN">
and the declaration for Pre-HTML is
<!DOCTYPE Pre-HTML PUBLIC
"-//ISO-HTML User's Guide//DTD Preparation of ISO-HTML//EN"
[<!ENTITY % Preparation "INCLUDE">
general entity declarations...
]>
The formal public identifiers (FPI) [8879 10.2] in the DOCTYPE
declarations for ISO-HTML and Pre-HTML are used as keys to identify
the corresponding entries in a catalogue
which is usually placed in a file named catalog
. The
catalogue associates the same file name with the three FPIs:
PUBLIC "ISO/IEC 15445:2000//DTD HyperText Markup Language//EN" 15445.dtd
PUBLIC "ISO/IEC 15445:2000//DTD HTML//EN" 15445.dtd
PUBLIC "-//ISO-HTML User's Guide//DTD Preparation of ISO-HTML//EN" 15445.dtd
Parsers such as SP which support use of a catalogue use the FPIs to find the name of the file containing the DTD.
NOTE: The file name is system dependent. A different name may be needed on restricted operating systems.
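The lookup performed by a catalogue-aware parser can be pictured with the following Python sketch. It is a deliberately reduced illustration: only PUBLIC entries are recognised, catalogue comments and other entry types are ignored, and the names `parse_catalog`, `resolve` and `SAMPLE_CATALOG` are assumptions made for this example rather than part of any parser's interface.

```python
import re
from typing import Dict, Optional

# A PUBLIC entry: keyword, quoted formal public identifier, file name.
CATALOG_ENTRY = re.compile(r'PUBLIC\s+"([^"]*)"\s+(\S+)')

SAMPLE_CATALOG = '''
PUBLIC "ISO/IEC 15445:2000//DTD HyperText Markup Language//EN" 15445.dtd
PUBLIC "ISO/IEC 15445:2000//DTD HTML//EN" 15445.dtd
'''


def normalize_fpi(fpi: str) -> str:
    """Collapse runs of white space, as is done for public identifiers."""
    return " ".join(fpi.split())


def parse_catalog(text: str) -> Dict[str, str]:
    """Map each formal public identifier to its file name."""
    return {normalize_fpi(fpi): name for fpi, name in CATALOG_ENTRY.findall(text)}


def resolve(catalog: Dict[str, str], fpi: str) -> Optional[str]:
    """Return the file name for an identifier, or None if it is unknown."""
    return catalog.get(normalize_fpi(fpi))
```

Because the key is the complete formal public identifier, an identifier that differs from the catalogue entry in any word is simply not found.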
The Pre-HTML internal subset contains a declaration for the parameter
entity %Preparation;
for which the default value is defined
in the ISO-HTML DTD. The value in the Pre-HTML internal subset takes
precedence over the default value provided in the ISO-HTML DTD (see
[8879 9.4.4.1]).
If %Preparation;
has the value INCLUDE
,
the mechanisms which require correct nesting of the elements <H1>
[W3C 7.5.5]
through <H6>
[W3C 7.5.5] are included. If the value is IGNORE
, the
mechanisms which require correct nesting of headings are omitted.
The ISO-HTML DTD defines the inverse parameter entity
%NoPreparation;
. The result is that the DTD specifies the
following parameter entities:
For an ISO-HTML document: %Preparation; = IGNORE, %NoPreparation; = INCLUDE
For a Pre-HTML document: %Preparation; = INCLUDE, %NoPreparation; = IGNORE
The SGML parser parses the files making up the formal definition of
ISO-HTML, taking into account the values of the
%Preparation;
and %NoPreparation;
parameter
entities specified in step 2. The parameter entities control the
inclusion or exclusion of marked sections, see [8879 10.4], thus
changing the formal definitions.
A typical effect of the parameter entity is in the modification of the
element <BODY>
[W3C 7.5.1].
<![ %Preparation;
[
<!ELEMENT BODY - O ((%block;)*, (H1,DIV1)* )
+(DEL|INS) >
]]>
<![ %NoPreparation;
[
<!ELEMENT BODY - O (%block;|H1|H2|H3|H4|H5|H6)+
+(DEL|INS) >
]]>
When the author's DOCTYPE declaration calls for ISO-HTML, this is the same as:
<!ELEMENT BODY - O (%block;|H1|H2|H3|H4|H5|H6)+
+(DEL|INS) >
but when the author's DOCTYPE declaration calls for Pre-HTML, this is the same as:
<!ELEMENT BODY - O ((%block;)*, (H1,DIV1)* )
+(DEL|INS) >
Before we begin the discussion of folding to upper case, we need to review some SGML vocabulary. Consider the following attribute definition list declaration [SGML 11.3]:
<!ATTLIST ...
LANG NAME #IMPLIED -- RFC1766 language value --
ID ID #IMPLIED -- Document-wide unique id --
HREF CDATA #IMPLIED -- Universal Resource Identifier, RFC1630 --
NAME CDATA #IMPLIED -- Target anchor --
>
Each of the four attribute definitions in the list consists of three parts. For example in the third attribute definition:
HREF
: The attribute name.
CDATA
: SGML calls this the declared value
[SGML 11.3.3]. A programmer might prefer to call it the type
of the attribute. There are fifteen types of attribute value
corresponding to the following SGML keywords: CDATA, ENTITY, ENTITIES,
ID, IDREF, IDREFS, NAME, NAMES, NMTOKEN, NMTOKENS, NUMBER, NUMBERS,
NUTOKEN, NUTOKENS and NOTATION. (There is also a name group
construction which does not concern us here.) Of these, only CDATA,
ID, IDREF, IDREFS, NAME and NUMBER appear in ISO-HTML.
#IMPLIED
: SGML calls this the default
value. The default value #IMPLIED
means that if this
attribute is omitted, it's up to the application to decide what to do.
Attributes with certain "declared values"/"types" have their values automatically folded to upper case in certain conditions. What are these conditions? They are given in the SGML declaration, in the section:
NAMING ...
NAMECASE GENERAL YES
ENTITY NO
The declaration NAMING ... NAMECASE GENERAL YES
means
that those syntactic items which are names, are to be folded to upper
case. For example, if an attribute has the type ID, IDREF, IDREFS,
NAME, NAMES, NMTOKEN, NMTOKENS, NUTOKEN or NUTOKENS then its value is
to be automatically folded [SGML 13.4.5]. Of these, only ID, IDREF,
IDREFS and NAME appear in ISO-HTML.
NOTE: We are talking about the types here, not the
attribute names. It is easy to confuse the attribute name NAME
and the type NAME.
The declaration NAMING ... NAMECASE ENTITY NO
means that
those syntactic items which are the names of entities, are not to be
folded to upper case. For example, if an attribute has the type
ENTITY or ENTITIES then its value is not automatically folded [SGML
13.4.5]. This situation does not occur in ISO-HTML, but could occur
in a Pre-HTML document.
You will see those ISO-HTML attributes whose values are folded to upper case by inspecting the ISO-HTML DTD and noting those which have a declared value/type of ID, IDREF, IDREFS or NAME.
In XHTML which is an application of XML, the SGML declaration becomes
NAMING ...
NAMECASE GENERAL NO
ENTITY NO
which removes all case folding. XHTML is case sensitive.
The following table summarizes the situation for anchors.
Attribute name | Attribute type (declared value) | Automatic folding? |
---|---|---|
ID | ID | Yes |
HREF | CDATA | No |
NAME | CDATA | No |
We suggest that you now read the clause in the W3C Recommendation for HTML 4.01 which discusses 12.2.3 Anchors with the id attribute. In summary:
The ID
attribute may be used to create an anchor at the start
tag of an element.
The ID
and NAME
attributes share the same name space.
<A>
[W3C 12.2], <FORM>
[W3C 17.3], <IMG>
[W3C 13.2]
and <MAP>
[W3C 13.6.1].
ID
and
NAME
must be the same when both appear in an element's start tag:
<P><A name="a1" id="a1" href="#a1">...</A>
Clearly there is a contradiction here: the value of the
ID
attribute is folded automatically, but the values of the NAME
and HREF
attributes are not. The example suggests that
names are to be equal before folding, but the
equality test is applied after any folding.
The case folding behaviour of browsers and other tools is in general
undefined. The very useful tool HTML tidy,
which cleans up the broken HTML generated by many authoring tools,
checks that when attributes ID
and NAME
are used together on an
element, they have the same value. However this test is made without
any folding. As a result, if the values contain lower case
characters, and the document is later passed through the SP tool
sgmlnorm
, the document no longer satisfies HTML tidy,
even though from a strict SGML point of view, nothing has changed.
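The difficulty can be reproduced with the following Python sketch, which is an illustration only. The declared values are those given in the table above, the set of folded types is the one listed for NAMECASE GENERAL YES, and the names `fold` and `normalize_anchor` are assumptions made for this example; they do not describe the internals of sgmlnorm or HTML tidy.

```python
from typing import Dict

# Declared values whose attribute values are folded to upper case
# when the SGML declaration specifies NAMECASE GENERAL YES.
FOLDED_TYPES = {
    "ID", "IDREF", "IDREFS", "NAME", "NAMES",
    "NMTOKEN", "NMTOKENS", "NUTOKEN", "NUTOKENS",
}

# Declared values of the anchor attributes, as in the table above.
ANCHOR_TYPES = {"ID": "ID", "NAME": "CDATA", "HREF": "CDATA"}


def fold(value: str, declared_value: str) -> str:
    """Fold a value to upper case if its declared value calls for it."""
    return value.upper() if declared_value in FOLDED_TYPES else value


def normalize_anchor(attributes: Dict[str, str]) -> Dict[str, str]:
    """Apply folding to the ID, NAME and HREF attributes of an anchor."""
    return {
        name: fold(value, ANCHOR_TYPES[name])
        for name, value in attributes.items()
    }
```

For an anchor written with name="a1" and id="a1", the sketch returns "A1" for ID but leaves NAME as "a1", so a comparison made without folding no longer succeeds. Values already written in upper case pass through unchanged.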
As far as case folding is concerned, the International Standard requires that
conforming documents satisfy the requirements of the W3C Recommendation for HTML
4.01 and
those of SGML, but without saying how. We recommend that authors
satisfy these requirements by restricting themselves to the 40
characters "ABCDEFGHIJKLMNOPQRSTUVWXYZ.-_:0123456789" for ID
and
NAME
values, and for the corresponding HREF
values.
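A check for this recommendation is easily automated. The following Python sketch is an illustration; the names `ALLOWED`, `is_fold_safe` and `href_is_fold_safe` are assumptions made for this example. It tests only membership of the 40 characters and does not verify that an ID value begins with a letter, as SGML requires of a name.

```python
# The 40 characters recommended for ID and NAME values.
ALLOWED = frozenset("ABCDEFGHIJKLMNOPQRSTUVWXYZ.-_:0123456789")


def is_fold_safe(value: str) -> bool:
    """True if the value is unchanged by folding to upper case."""
    return bool(value) and all(character in ALLOWED for character in value)


def href_is_fold_safe(href: str) -> bool:
    """Check the fragment identifier of an HREF value, if there is one."""
    _, separator, fragment = href.partition("#")
    return is_fold_safe(fragment) if separator else True
```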
NOTE: In the markup for the International Standard all the values of the ID
, and
NAME
attributes, and the corresponding HREF
values are written
in upper case. This allows the markup to pass through the SGML parser
sgmlnorm
and remain acceptable to HTML tidy.
[1] "Hypertext Markup Language - 2.0". T. Berners-Lee, D. Connolly. IETF RFC1866, November 1995. Category: Standards Track. http://www.ietf.org/rfc/rfc1866.txt
[2] "Multipurpose Internet Mail Extensions (MIME) Part Two: Media Types", N. Freed, N. Borenstein. IETF RFC2046, November 1996. Category: Standards Track. Obsoletes: 1521, 1522, 1590. http://www.ietf.org/rfc/rfc2046.txt
(normative in the International Standard)
The SGML declaration for ISO-HTML is provided by this file:
<!SGML "ISO 8879:1986 (WWW)"
  --
     ISO/IEC 15445 Hypertext Markup Language (ISO-HTML)
     SGML Declaration

     Copyright (C) 2000 IETF, W3C (MIT, Inria, Keio), ISO/IEC
     All Rights Reserved

     Permission to copy in any form is granted for use with
     validating and conforming systems and applications as defined
     in ISO/IEC 15445, provided this copyright notice is included
     with all copies.
  --

  CHARSET
    -- First 17 planes of ISO 10646. --
    BASESET "ISO Registration Number 177//CHARSET ISO/IEC 10646-1:1993 UCS-4 with implementation level 3//ESC 2/5 2/15 4/6"
    DESCSET     0        9  UNUSED
                9        2  9
               11        2  UNUSED
               13        1  13
               14       18  UNUSED
               32       95  32
              127        1  UNUSED
              128       32  UNUSED
              160    55136  160
            55296     2048  UNUSED
            57344  1056768  57344

    -- ISO/IEC 10646 does not define all positions. For example, it
       reserves positions with hexadecimal values 0000D800 - 0000DFFF,
       used in the UTF-16 encoding of UCS-4, as well as the last two
       code values in each plane of UCS-4, ie. all values of the
       hexadecimal form xxxxFFFE and xxxxFFFF. Undefined code values
       and the corresponding numeric character references should not
       be included in an HTML document, and they shall be ignored if
       encountered when processing an HTML document. --

  CAPACITY SGMLREF
    TOTALCAP 150000
    GRPCAP   150000
    ENTCAP   150000

  SCOPE DOCUMENT

  SYNTAX
    SHUNCHAR CONTROLS 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
             18 19 20 21 22 23 24 25 26 27 28 29 30 31 127
    BASESET "ISO 646IRV:1991//CHARSET International Reference Version (IRV)//ESC 2/8 4/2"
    DESCSET 0 128 0

    FUNCTION
      RE          13
      RS          10
      SPACE       32
      TAB SEPCHAR  9 -- Deprecated --

    NAMING
      LCNMSTRT ""
      UCNMSTRT ""
      LCNMCHAR ".-_:"
      UCNMCHAR ".-_:"
      NAMECASE GENERAL YES
               ENTITY  NO

    DELIM
      GENERAL  SGMLREF
      HCRO     "&#38;#x" -- 38 is Ampersand --
      SHORTREF SGMLREF

    NAMES SGMLREF

    QUANTITY SGMLREF
      ATTCNT      60
      ATTSPLEN 65536 -- These are the largest values --
      LITLEN   65536 -- permitted in the declaration. --
      NAMELEN  65536 -- Avoid fixed limits in actual --
      PILEN    65536 -- implementations of user agents. --
      TAGLVL     100
      TAGLEN   65536
      GRPGTCNT   150
      GRPCNT      64

  FEATURES
    MINIMIZE
      DATATAG  NO
      OMITTAG  YES
      RANK     NO
      SHORTTAG YES
    LINK
      SIMPLE   NO
      IMPLICIT NO
      EXPLICIT NO
    OTHER
      CONCUR   NO
      SUBDOC   NO
      FORMAL   YES

  APPINFO NONE
>
(normative in the International Standard)
Part 1 of the DTD for ISO-HTML contains parameter entity definitions used in Parts 2 and 3, and the short reference mapping [8879 11.5] which converts the deprecated horizontal tab into a space. Part 2 contains the elements and their content models. Part 3 provides the attribute definitions and additional normative refinements that ISO-HTML places on the elements.
The document type definition (DTD) for ISO-HTML is provided by this file.
After the International Standard was published, it was discovered that there was a discrepancy between the W3C Recommendations and the ISO/IEC specification in the formal public identifier [8879 10.2] used to identify the set of entities defined by the W3C for the characters of ISO 8859-1 8-bit single-byte coded graphic character sets — Latin alphabet No. 1 commonly known as ``ISO latin 1''.
The formal public identifier used in the ISO/IEC DTD:
-//W3C//ENTITIES Full Latin 1//EN//HTML
contains a public text description [8879 10.2.2.2] ``Full
Latin 1
''. However the W3C recommendations had used
``Latin 1
'' and ``Latin1
''. Had the public
text description identified an ISO publication, then it would have
been created in accordance with the rule given by
[8879 10.2.2.2]:
It consists of the last element of the publication title, without the part number designation (if any).
If this rule had been applicable, then the public text descriptor
would have been ``Latin alphabet No. 1
'',
giving the formal public identifier
-//W3C//ENTITIES Latin alphabet No. 1//EN//HTML
However the ISO rule is not applicable to W3C publications.
The solution chosen is to consider all four public text descriptions to be valid and equivalent, which means that the four formal public identifiers:
-//W3C//ENTITIES Latin alphabet No. 1//EN//HTML
-//W3C//ENTITIES Full Latin 1//EN//HTML
-//W3C//ENTITIES Latin 1//EN//HTML
-//W3C//ENTITIES Latin1//EN//HTML
specify the same entity set.
The DTD defined in this clause references the entity set specified
by the W3C to define the characters of ISO 8859-1 8-bit
single-byte coded graphic character sets — Latin alphabet
No. 1. The reference uses a formal public identifier ``-//W3C//ENTITIES Full Latin 1//EN//HTML
'' which
contains the public text description ``Full
Latin 1
''. The public text descriptions ``Latin alphabet No. 1
'', ``Latin 1
'' and ``Latin1
'' are permitted alternatives which
describe the same entity set.
A similar situation arises for the reference by the DTD defined in
this clause to the entity set specified by the W3C for mathematical,
Greek and symbolic characters. The reference uses a formal public
identifier ``-//W3C//ENTITIES
Symbolic//EN//HTML
'' which contains the public text description
``Symbolic
''. However the W3C in HTML 4.01 subclause A.2.1 Errors that were corrected changed the
public text description to ``Symbols
''.
We recommend that system administrators use the same technique as used
for DTD identification to identify the entity
sets. The formal public identifiers (FPI) [8879 10.2] of the
entity sets are used as keys to identify the corresponding entries in
a catalogue which is usually placed in a
file named catalog
. The catalogue associates the same
file name with the equivalent FPIs:
PUBLIC "-//W3C//ENTITIES Latin alphabet No. 1//EN//HTML"
ISOlatin1.entities
PUBLIC "-//W3C//ENTITIES Full Latin 1//EN//HTML"
ISOlatin1.entities
PUBLIC "-//W3C//ENTITIES Latin 1//EN//HTML"
ISOlatin1.entities
PUBLIC "-//W3C//ENTITIES Latin1//EN//HTML"
ISOlatin1.entities
PUBLIC "-//W3C//ENTITIES Symbolic//EN//HTML"
Symbols.entities
PUBLIC "-//W3C//ENTITIES Symbols//EN//HTML"
Symbols.entities
PUBLIC "-//W3C//ENTITIES Special//EN//HTML"
Special.entities
NOTE: The file name is system dependent. A different name may be needed on restricted operating systems.
The DTD defined in this clause references the entity set specified
by the W3C to define mathematical, Greek and symbolic characters.
The reference uses a formal public identifier ``-//W3C//ENTITIES Symbolic//EN//HTML
'' which
contains the public text description ``Symbolic
''. The public text description
``Symbols
'' is a permitted alternative
which describes the same entity set.
NOTE: The User's Guide to this International Standard describes a way in which system administrators may allow simultaneous use of these alternatives.
<!--
    15445.dtd

    ISO/IEC 15445:2000 Hypertext Markup Language (HTML)
    Document Type Definition.

    Copyright (C) 2000-2003, IETF, W3C (MIT, Inria, Keio), ISO/IEC.
    All Rights Reserved.

    Permission to copy in any form is granted for use with
    validating and conforming systems and applications as defined
    in ISO/IEC 15445:2000, provided this copyright notice is
    included with all copies.

    The DTD is typically invoked by one of the following declarations:

    <!DOCTYPE HTML PUBLIC "ISO/IEC 15445:2000//DTD HyperText Markup Language//EN">
    <!DOCTYPE HTML PUBLIC "ISO/IEC 15445:2000//DTD HTML//EN">

    In order to use the HTML document type definition as a base
    architecture for other SGML applications, one of the following
    architectural support declarations should be used:

    <?IS10744 arch name="html"
        public-id="ISO/IEC 15445:2000//DTD HyperText Markup Language//EN"
        dtd-system-id="./15445.dtd"
        renamer-att="HTMLnames"
        doc-elem-form="HTML" >

    <!ENTITY % HtmlDtd PUBLIC "ISO/IEC 15445:2000//DTD HTML//EN">
    <?IS10744 ArcBase HTML>
    <!NOTATION HTML PUBLIC
        "-//ISO-HTML User's Guide//NOTATION HTML Architecture//EN">
    <!ATTLIST #NOTATION HTML
        ArcDTD   CDATA #FIXED "%HtmlDtd"
        ArcDocF  NAME  #FIXED "HTML"
        ArcNamrA NAME  #IMPLIED >
-->

<!-- Part 1 - Entity set -->

<!-- The Preparation parameter entity shall be set to IGNORE for
     HTML, and to INCLUDE for a document to be submitted to the
     preparation process -->
<!ENTITY % Preparation "IGNORE" >

<!-- This definition generates the inverse entity NoPreparation
     which is internal to the DTD -->
<![ %Preparation; [
<!ENTITY % NoPreparation "IGNORE" -- Inverse of Preparation = INCLUDE -->
]]>
<!ENTITY % NoPreparation "INCLUDE" -- Inverse of Preparation = IGNORE -->
<!-- End of definition -->

<!-- Tokens defined by other standards -->
<!ENTITY % Content-Type "CDATA" -- MIME content type, RFC1521 -->
<!ENTITY % HTTP-Method "(get | post)" -- as per HTTP/1.1 RFC2068 -->
<!ENTITY % URI "CDATA" -- Universal Resource Identifier, RFC1630 -->

<!-- Element tokens -->
<!ENTITY % special "A | BDO | BR | IMG | OBJECT | MAP | Q | SPAN" >

<!-- Logical character styles -->
<!ENTITY % logical.styles
    "ABBR | ACRONYM | CITE | CODE | DFN | EM | KBD | SAMP | STRONG | VAR" >

<!-- Physical character styles -->
<!ENTITY % physical.styles "B | I | SUB | SUP | TT" >

<!-- Model groups -->

<!-- Block-like elements eg. paragraphs and lists -->
<!ENTITY % block
    "BLOCKQUOTE | DIV | DL | FIELDSET | FORM | HR | OL | P | PRE | TABLE | UL" >

<!-- Form fields - input elements that should appear only within forms -->
<!ENTITY % form.fields "BUTTON | INPUT | LABEL | SELECT | TEXTAREA" >

<!-- Character level elements and text strings -->
<!ENTITY % text
    "#PCDATA | %physical.styles; | %logical.styles; | %special; | %form.fields;" >

<!-- Elements that may appear in a section or table -->
<!ENTITY % section.content "(%block; | %text; | ADDRESS)+" >
<!ENTITY % table.content "(%block; | %text;)*" >

<!-- Generic attributes -->
<!ENTITY % core
    "CLASS CDATA #IMPLIED -- Comma separated list of class values --
     --The name space of the ID attribute is shared with the name
       space of the NAME attribute. Both ID and NAME attributes may
       be provided for the <A> and <MAP> elements. When both ID and
       NAME values are provided for an element, the values shall be
       identical. It is an error for an ID or NAME value to be
       associated with more than one element in a document. It is
       recommended that authors of documents specify both the ID
       attribute and the NAME attribute for the <A> and <MAP>
       elements. --
     ID ID #IMPLIED -- Document-wide unique id --
     TITLE CDATA #IMPLIED -- Advisory title or amplification --" >

<!-- Internationalization attributes -->
<!ENTITY % i18n
    "DIR (ltr|rtl) #IMPLIED -- Direction for weak/neutral text --
     LANG NAME #IMPLIED -- RFC1766 language value --" >

<!-- Presentation styles -->
<!ENTITY % shape "(circle | default | poly | rect)" >
<!ENTITY % InputType
    "(checkbox | file | hidden | password | radio | reset | submit | text)" >

<!-- SHORTREF mapping for the tab character -->
<!-- Use of the tab character is deprecated. However, to facilitate
     the preparation of conforming documents by authors who use it,
     the tab character is tolerated and is mapped into a single
     space. -->
<!ENTITY nontab " " >
<!SHORTREF tabmap "&#TAB;" nontab >
<!USEMAP tabmap HTML >

<!-- Specify character entity sets defined by W3C -->
<!ENTITY % HTMLlat1 PUBLIC "-//W3C//ENTITIES Full Latin 1//EN//HTML" >
<!ENTITY % HTMLsymbol PUBLIC "-//W3C//ENTITIES Symbolic//EN//HTML" >
<!ENTITY % HTMLspecial PUBLIC "-//W3C//ENTITIES Special//EN//HTML" >

<!-- Reference character entities -->
%HTMLlat1;%HTMLsymbol;%HTMLspecial;

<!-- Part 2 - Document structure -->
<!-- Further normative requirements on the elements defined in this
     part of the DTD are provided in Part 3.-->

<!-- ELEMENTS MIN CONTENT (EXCEPTIONS) -->
<!ELEMENT HTML - - (HEAD, BODY) >
<!ELEMENT HEAD - O (TITLE) +(LINK | META | STYLE) >
<!ELEMENT TITLE - - (#PCDATA) -(LINK | META | STYLE) >
<!ELEMENT LINK - O EMPTY >
<!ELEMENT META - O EMPTY >
<!ELEMENT STYLE - - CDATA >

<!-- The following marked section is informative only -->
<![ %Preparation; [
<!ELEMENT Pre-HTML - - (HEAD, BODY) >
<!ATTLIST Pre-HTML %i18n; -- Internationalization DIR and LANG -->
<!ELEMENT BODY - O ((%block;)*,(H1,DIV1)* ) +(DEL|INS) >
<!ELEMENT H1 - - (%text;)+ >
<!ELEMENT DIV1 O O ((%block;)*, (H2,DIV2)* ) >
<!ELEMENT H2 - - (%text;)+ >
<!ELEMENT DIV2 O O ((%block;)*, (H3,DIV3)* ) >
<!ELEMENT H3 - - (%text;)+ >
<!ELEMENT DIV3 O O ((%block;)*, (H4,DIV4)* )
> <!ELEMENT H4 - - (%text;)+ > <!ELEMENT DIV4 O O ((%block;)*, (H5,DIV5)* ) > <!ELEMENT H5 - - (%text;)+ > <!ELEMENT DIV5 O O ((%block;)*, (H6,DIV6)* ) > <!ELEMENT H6 - - (%text;)+ > <!ELEMENT DIV6 O O ((%block;)*) > ]]> <!-- The following marked section is normative --> <![ %NoPreparation; [ <!ELEMENT BODY - O (%block;|H1|H2|H3|H4|H5|H6)+ +(DEL|INS) > <!ELEMENT (H1|H2|H3|H4|H5|H6) - - (%text;)+ > ]]> <!ELEMENT DIV - - %section.content; > <!ELEMENT ADDRESS - - (%text;)+ -(IMG|OBJECT|MAP) > <!ELEMENT P - O (%text;)+ > <!ELEMENT (OL|UL) - - (LI)+ > <!ELEMENT LI - O (%text; | %block;)+ > <!ELEMENT DL - - (DT|DD)+ > <!ELEMENT DT - O (%text;)+ > <!ELEMENT DD - O %section.content; -(ADDRESS) > <!ELEMENT PRE - - (%text;)+ -(IMG|MAP|OBJECT|SUB|SUP) > <!ELEMENT BLOCKQUOTE - - (%block;)+ > <!ELEMENT Q - - (%text;)+ > <!ELEMENT FORM - - (%block;)+ -(FORM) > <!-- #PCDATA required to absorb leading white space --> <!ELEMENT FIELDSET - - (#PCDATA,LEGEND,(%block; | %text; | ADDRESS)+) -(FIELDSET) > <!ELEMENT INPUT - O EMPTY > <!ELEMENT BUTTON - - (%text;)+ -(A|FIELDSET|FORM|%form.fields;) > <!ELEMENT LABEL - - (%text;)+ -(LABEL) > <!ELEMENT LEGEND - - (#PCDATA) > <!ELEMENT SELECT - - (OPTGROUP|OPTION)+ > <!ELEMENT OPTGROUP - - (OPTION)+ > <!ELEMENT OPTION - O (#PCDATA) > <!ELEMENT TEXTAREA - - (#PCDATA) > <!ELEMENT HR - O EMPTY > <!ELEMENT TABLE - - (CAPTION?, (COL*|COLGROUP*), THEAD?, TFOOT?, TBODY+) > <!ELEMENT CAPTION - - (%text;)+ > <!ELEMENT (THEAD,TFOOT,TBODY) - O (TR)+ > <!ELEMENT COL - O EMPTY > <!ELEMENT COLGROUP - O (COL)* > <!ELEMENT TR - O (TH|TD)+ > <!ELEMENT (TH|TD) - O %table.content; > <!ELEMENT (%logical.styles;|%physical.styles;) - - (%text;)+ > <!ELEMENT A - - (%text;)* -(A) > <!ELEMENT IMG - O EMPTY > <!ELEMENT OBJECT - - (PARAM | %section.content;)* > <!ELEMENT PARAM - O EMPTY > <!ELEMENT BR - O EMPTY > <!-- Authors should use the block-level content of the <MAP> element when creating accessible documents. 
Each region should be specified using an <A> element to define its associated link and shape. User agents should render the block-level content of a <MAP> element. --> <!ELEMENT MAP - - ((%block;)|AREA)+ > <!ELEMENT AREA - O EMPTY > <!ELEMENT SPAN - - (%text;)+ > <!ELEMENT (DEL|INS) - - (%text;)+ > <!ELEMENT BDO - - (%text;)+ > <!-- Part 3 - Attribute definition lists --> <!-- ELEMENTS NAME VALUE DEFAULT --> <!ATTLIST A --Case shall not be taken into account when determining a match between an ID value and a NAME value, between an ID value and an HREF value or between a NAME value and an HREF value. Comparisons should be made with the values folded to upper case. The NAME attribute value specification shall be processed as if the declared value were NAME. It is recommended that authors of HTML documents specify both ID and NAME attributes, and use values restricted to the 40 characters "ABCDEFGHIJKLMNOPQRSTUVWXYZ.-_:0123456789". When both attributes are specified, they shall have identical values. COORDS shall not be specified if SHAPE has the value `default'. -- %core; -- Element CLASS, ID and TITLE -- %i18n; -- Internationalization DIR and LANG -- ACCESSKEY CDATA #IMPLIED -- Accessibility key character -- CHARSET CDATA #IMPLIED -- Character encoding as per RFC2045 -- COORDS CDATA #IMPLIED -- Comma separated list of values -- HREF %URI; #IMPLIED -- Source anchor is URI of target -- HREFLANG NAME #IMPLIED -- Language code of resource -- NAME CDATA #IMPLIED -- Target anchor -- REL CDATA #IMPLIED -- Forward link types -- REV CDATA #IMPLIED -- Reverse link types -- SHAPE %shape; rect -- Control interpretation of coords -- TABINDEX NUMBER #IMPLIED -- Position in tabbing order -- TYPE CDATA #IMPLIED -- Advisory content type --> <!ATTLIST ADDRESS %core; -- Element CLASS, ID and TITLE -- %i18n; -- Internationalization DIR and LANG --> <!ATTLIST AREA --One of HREF or NOHREF shall be specified. COORDS shall not be specified if SHAPE has the value `default'. 
Authors are very strongly recommended to provide meaningful ALT attributes to support interoperability with speech-based or text-only agents. The language and direction of the text provided by the ALT attribute are defined by the containing elements. -- %core; -- Element CLASS, ID and TITLE -- %i18n; -- Internationalization DIR and LANG -- ACCESSKEY CDATA #IMPLIED -- Accessibility key character -- ALT CDATA #REQUIRED -- Description for text-only UAs -- COORDS CDATA #IMPLIED -- Comma separated list of values -- HREF %URI; #IMPLIED -- This region acts as hypertext link -- NOHREF (nohref) #IMPLIED -- This region has no action -- SHAPE %shape; rect -- Control interpretation of coords -- TABINDEX NUMBER #IMPLIED -- Position in tabbing order --> <!ATTLIST BDO %core; -- Element CLASS, ID and TITLE -- DIR (ltr|rtl) #REQUIRED -- Direction of writing -- LANG NAME #IMPLIED -- RFC1766 language value --> <!ATTLIST BLOCKQUOTE --The contents of the <BLOCKQUOTE> element shall not be surrounded with quotation marks. These may be added by the user agent through the use of a style sheet. -- %core; -- Element CLASS, ID and TITLE -- %i18n; -- Internationalization DIR and LANG -- CITE %URI; #IMPLIED -- URI for source document or message --> <!ATTLIST BODY %core; -- Element CLASS, ID and TITLE -- %i18n; -- Internationalization DIR and LANG --> <!ATTLIST BR %core; -- Element CLASS, ID and TITLE --> <!ATTLIST BUTTON --The <BUTTON> element shall not contain the <A>, <BUTTON>, <FIELDSET>, <FORM>, <INPUT>, <LABEL>, <SELECT> or <TEXTAREA> elements. If the <BUTTON> element contains an <IMG> element, the <IMG> shall not have an ISMAP or USEMAP attribute. The TYPE attribute shall be provided, and when the TYPE is specified as `submit', the NAME and VALUE attributes shall be provided. The NAME attribute is required if the TYPE attribute has the value `submit'. 
If the TYPE attribute has value `reset', and the <BUTTON> is contained in a <FIELDSET>, the reset action is limited to the contents of the <FIELDSET>. The VALUE attribute is required if the TYPE attribute has the value `submit' and specifies the value to be returned if the button is selected. The <BUTTON> element should be used only in the content of a <FORM> element. -- %core; -- Element CLASS, ID and TITLE -- %i18n; -- Internationalization DIR and LANG -- ACCESSKEY CDATA #IMPLIED -- Accessibility key character -- DISABLED (disabled) #IMPLIED -- Control unavailable in this context -- NAME CDATA #IMPLIED -- Required for all except submit, reset -- TABINDEX NUMBER #IMPLIED -- Position in tabbing order -- TYPE (submit|reset) submit -- For use as form submit/reset button -- VALUE CDATA #IMPLIED -- Passed to server when submitted --> <!ATTLIST CAPTION %core; -- Element CLASS, ID and TITLE -- %i18n; -- Internationalization DIR and LANG --> <!ATTLIST COL %core; -- Element CLASS, ID and TITLE -- %i18n; -- Internationalization DIR and LANG -- SPAN NUMBER 1 -- Number of cols spanned --> <!ATTLIST COLGROUP --The SPAN attribute should only be used if the <COLGROUP> element has no content. 
-- %core; -- Element CLASS, ID and TITLE -- %i18n; -- Internationalization DIR and LANG -- SPAN NUMBER 1 -- Number of cols spanned by group --> <!ATTLIST DD %core; -- Element CLASS, ID and TITLE -- %i18n; -- Internationalization DIR and LANG --> <!ATTLIST DEL %core; -- Element CLASS, ID and TITLE -- %i18n; -- Internationalization DIR and LANG -- CITE %URI; #IMPLIED -- Information on reason for change -- DATETIME CDATA #IMPLIED -- When changed, subset of ISO/IEC 8601 --> <!ATTLIST DIV %core; -- Element CLASS, ID and TITLE -- %i18n; -- Internationalization DIR and LANG --> <!ATTLIST DL %core; -- Element CLASS, ID and TITLE -- %i18n; -- Internationalization DIR and LANG --> <!ATTLIST DT %core; -- Element CLASS, ID and TITLE -- %i18n; -- Internationalization DIR and LANG --> <!ATTLIST FIELDSET %core; -- Element CLASS, ID and TITLE -- %i18n; -- Internationalization DIR and LANG --> <!ATTLIST FORM %core; -- Element CLASS, ID and TITLE -- %i18n; -- Internationalization DIR and LANG -- ACCEPT CDATA #IMPLIED -- List of MIME types for file upload -- ACCEPT-CHARSET CDATA #IMPLIED -- List of supported char sets -- ACTION %URI; #REQUIRED -- Server-side form handler -- ENCTYPE %Content-Type; "application/x-www-form-urlencoded" METHOD %HTTP-Method; get -- See HTTP specification --> <!ATTLIST HEAD %i18n; -- Internationalization DIR and LANG -- PROFILE %URI; #IMPLIED -- Named dictionary of meta info --> <!ATTLIST HR %core; -- Element CLASS, ID and TITLE -- %i18n; -- Internationalization DIR and LANG --> <!ATTLIST HTML %i18n; -- Internationalization DIR and LANG --> <!ATTLIST (H1 | H2 | H3 | H4 | H5 | H6) --The <H1> element shall not be followed by an <H3>, <H4>, <H5> or <H6> element without an intervening <H2> element. The <H2> element shall not be followed by an <H4>, <H5> or <H6> element without an intervening <H3> element. The <H3> element shall not be followed by an <H5> or <H6> element without an intervening <H4> element. 
The <H4> element shall not be followed by an <H6> element without an intervening <H5> element. An <H2> element shall be preceded by an <H1> element. An <H3> element shall be preceded by an <H2> element. An <H4> element shall be preceded by an <H3> element. An <H5> element shall be preceded by an <H4> element. An <H6> element shall be preceded by an <H5> element. -- %core; -- Element CLASS, ID and TITLE -- %i18n; -- Internationalization DIR and LANG --> <!ATTLIST IMG --If the <IMG> element is contained in a <BUTTON> element, the <IMG> shall not have an ISMAP or USEMAP attribute. If the ISMAP attribute is present in an <IMG> element, that <IMG> element shall be contained in an <A> element with an HREF attribute present. At most one of the attributes ISMAP and USEMAP may be provided. -- %core; -- Element CLASS, ID and TITLE -- %i18n; -- Internationalization DIR and LANG -- ALT CDATA #REQUIRED -- Text for text-only user agent -- ISMAP (ismap) #IMPLIED -- Use server image map -- LONGDESC %URI; #IMPLIED -- Extended description for text UA -- SRC %URI; #REQUIRED -- URI of image to embed -- USEMAP %URI; #IMPLIED -- Use client-side image map --> <!ATTLIST INPUT --If the attribute TYPE has the value `checkbox', values shall be provided for the NAME and VALUE attributes. If the attribute TYPE has the value `file', a value shall be provided for the NAME attribute; HTML interpreting agents should request user confirmation of any default file names that might be suggested, and fields specifying files shall not be hidden. If the attribute TYPE has the value `hidden', values shall be provided for the NAME and VALUE attributes. If the attribute TYPE has the value `password', a value shall be provided for the NAME attribute. If the attribute TYPE has the value `radio', values shall be provided for the NAME and VALUE attributes. At all times, one and only one of the radio buttons shall be checked. 
Initially, if none of the <INPUT> elements in a set of radio buttons specifies CHECKED, then the user agent shall mark the first radio button of the set as checked. If the attribute TYPE has the value `submit', and a value is specified for the VALUE attribute, then a value shall be provided for the NAME attribute. If the attribute TYPE has the value `text', values shall be provided for the NAME and VALUE attributes. The MAXLENGTH and TABINDEX values shall be considered as integers with any leading zeroes ignored. The <INPUT> element should be used only in the content of a <FORM> element. -- %core; -- Element CLASS, ID and TITLE -- %i18n; -- Internationalization DIR and LANG -- ACCEPT CDATA #IMPLIED -- List of MIME types for file upload -- ACCESSKEY CDATA #IMPLIED -- Accessibility key character -- CHECKED (checked) #IMPLIED -- For radio buttons, checkboxes -- DISABLED (disabled) #IMPLIED -- Control unavailable in this context -- MAXLENGTH NUMBER #IMPLIED -- Max chars for text fields -- NAME CDATA #IMPLIED -- Required for all except submit, reset -- READONLY (READONLY) #IMPLIED -- For text -- SIZE CDATA #IMPLIED -- Specific to each type of field -- TABINDEX NUMBER #IMPLIED -- Position in tabbing order -- TYPE %InputType; text -- Widget -- VALUE CDATA #IMPLIED -- Required for radio, checkboxes --> <!ATTLIST INS %core; -- Element CLASS, ID and TITLE -- %i18n; -- Internationalization DIR and LANG -- CITE %URI; #IMPLIED -- Information on reason for change -- DATETIME CDATA #IMPLIED -- When changed, subset of ISO/IEC 8601 --> <!ATTLIST LABEL --The <LABEL> element shall refer to a form field in the content of the <FORM> element which contains the <LABEL>. The <LABEL> element should be used only in the content of a <FORM> element. 
-- %core; -- Element CLASS, ID and TITLE -- %i18n; -- Internationalization DIR and LANG -- ACCESSKEY CDATA #IMPLIED -- Accessibility key character -- FOR IDREF #IMPLIED -- Points to associated field --> <!ATTLIST LEGEND %core; -- Element CLASS, ID and TITLE -- %i18n; -- Internationalization DIR and LANG -- ACCESSKEY CDATA #IMPLIED -- Accessibility key character --> <!ATTLIST LI %core; -- Element CLASS, ID and TITLE -- %i18n; -- Internationalization DIR and LANG --> <!ATTLIST LINK %core; -- Element CLASS, ID and TITLE -- %i18n; -- Internationalization DIR and LANG -- CHARSET CDATA #IMPLIED -- Character encoding as per RFC2045 -- HREF %URI; #IMPLIED -- URI for link resource -- HREFLANG NAME #IMPLIED -- Language code of resource -- MEDIA CDATA #IMPLIED -- Destination media of referenced doc -- REL CDATA #IMPLIED -- Forward link types -- REV CDATA #IMPLIED -- Reverse link types -- TYPE CDATA #IMPLIED -- Advisory Internet content type --> <!ATTLIST MAP --The value of the NAME attribute is case sensitive, and the attribute value specification shall be processed as if the declared value were NAME. It is recommended that authors of HTML documents specify both ID and NAME attributes, and use values restricted to the 40 characters "ABCDEFGHIJKLMNOPQRSTUVWXYZ.-_:0123456789". When both attributes are specified, they shall have identical values. 
-- %core; -- Element CLASS, ID and TITLE -- %i18n; -- Internationalization DIR and LANG -- NAME CDATA #REQUIRED -- Referenced by USEMAP in <IMG> --> <!ATTLIST META %i18n; -- Internationalization DIR and LANG -- CONTENT CDATA #REQUIRED -- Associated information -- HTTP-EQUIV NAME #IMPLIED -- HTTP response header name -- NAME NAME #IMPLIED -- Meta-information name -- SCHEME CDATA #IMPLIED -- Nature of content --> <!ATTLIST OBJECT %core; -- Element CLASS, ID and TITLE -- %i18n; -- Internationalization DIR and LANG -- CLASSID %URI; #IMPLIED -- Identifies implementation -- CODEBASE %URI; #IMPLIED -- Needed by some systems -- CODETYPE CDATA #IMPLIED -- Internet content type for code -- DATA %URI; #IMPLIED -- Reference to objects data -- DECLARE (declare) #IMPLIED -- Flag: declare but dont instantiate -- NAME CDATA #IMPLIED -- Submit as part of form -- STANDBY CDATA #IMPLIED -- Show this msg while loading -- TABINDEX NUMBER #IMPLIED -- Position in tabbing order -- TYPE CDATA #IMPLIED -- Internet content type for data -- USEMAP %URI; #IMPLIED -- Reference to image map --> <!ATTLIST OL %core; -- Element CLASS, ID and TITLE -- %i18n; -- Internationalization DIR and LANG --> <!ATTLIST OPTGROUP %core; -- Element CLASS, ID and TITLE -- %i18n; -- Internationalization DIR and LANG -- DISABLED (disabled) #IMPLIED -- Control unavailable in this context -- LABEL CDATA #REQUIRED -- For use in hierarchical menus --> <!ATTLIST OPTION %core; -- Element CLASS, ID and TITLE -- %i18n; -- Internationalization DIR and LANG -- DISABLED (disabled) #IMPLIED -- Control unavailable in this context -- LABEL CDATA #IMPLIED -- For use in hierarchical menus -- SELECTED (selected) #IMPLIED -- Pre-selected option -- VALUE CDATA #IMPLIED -- Defaults to content --> <!ATTLIST P %core; -- Element CLASS, ID and TITLE -- %i18n; -- Internationalization DIR and LANG --> <!ATTLIST PARAM ID ID #IMPLIED -- Document-wide unique id -- NAME CDATA #REQUIRED -- Name of parameter -- TYPE CDATA #IMPLIED -- Internet 
Media Type -- VALUE CDATA #IMPLIED -- Value of parameter -- VALUETYPE (data|ref|object) data -- Interpret value as --> <!ATTLIST PRE %core; -- Element CLASS, ID and TITLE -- %i18n; -- Internationalization DIR and LANG --> <!ATTLIST Q --The textual contents of the <Q> element shall not be surrounded with quotation marks. These may be added by the user agent through the use of a style sheet. -- %core; -- Element CLASS, ID and TITLE -- %i18n; -- Internationalization DIR and LANG -- CITE %URI; #IMPLIED -- URI for source document or message --> <!ATTLIST SELECT --The <SELECT> element should be used only in the content of a <FORM> element. -- %core; -- Element CLASS, ID and TITLE -- %i18n; -- Internationalization DIR and LANG -- DISABLED (disabled) #IMPLIED -- Control unavailable in this context -- MULTIPLE (multiple) #IMPLIED -- Default is single selection -- NAME CDATA #REQUIRED -- Field name -- SIZE NUMBER #IMPLIED -- Rows visible -- TABINDEX NUMBER #IMPLIED -- Position in tabbing order --> <!ATTLIST SPAN %core; -- Element CLASS, ID and TITLE -- %i18n; -- Internationalization DIR and LANG --> <!ATTLIST STYLE --The <STYLE> element contains style sheet information which shall be passed to the user agent's style manager. Any style sheet language may be used. It is a user agent error to render the style sheet information as if it were part of a document's text. -- %i18n; -- Internationalization DIR and LANG -- MEDIA CDATA #IMPLIED -- Designed for use with these media -- TITLE CDATA #IMPLIED -- Advisory title -- TYPE CDATA #REQUIRED -- Internet content type for style lang. 
--> <!ATTLIST TABLE %core; -- Element CLASS, ID and TITLE -- %i18n; -- Internationalization DIR and LANG -- SUMMARY CDATA #REQUIRED -- Purpose/structure for speech output --> <!ATTLIST TBODY %core; -- Element CLASS, ID and TITLE -- %i18n; -- Internationalization DIR and LANG --> <!ATTLIST TD %core; -- Element CLASS, ID and TITLE -- %i18n; -- Internationalization DIR and LANG -- ABBR CDATA #IMPLIED -- Abbreviation for header cell -- AXIS CDATA #IMPLIED -- Names groups of related headers -- COLSPAN NUMBER 1 -- Number of columns spanned by cell -- HEADERS IDREFS #IMPLIED -- List of ID's for header cells -- ROWSPAN NUMBER 1 -- Number of rows spanned by cell -- SCOPE (col|colgroup|row|rowgroup) #IMPLIED -- Scope covered by header cells --> <!ATTLIST TEXTAREA --The <TEXTAREA> element should be used only in the content of a <FORM> element. -- %core; -- Element CLASS, ID and TITLE -- %i18n; -- Internationalization DIR and LANG -- ACCESSKEY CDATA #IMPLIED -- Accessibility key character -- COLS NUMBER #REQUIRED -- Number required in av char widths -- DISABLED (disabled) #IMPLIED -- Control unavailable in this context -- NAME CDATA #REQUIRED -- Name of form field -- READONLY (readonly) #IMPLIED -- For text -- ROWS NUMBER #REQUIRED -- Number of rows required -- TABINDEX NUMBER #IMPLIED -- Position in tabbing order --> <!ATTLIST TFOOT %core; -- Element CLASS, ID and TITLE -- %i18n; -- Internationalization DIR and LANG --> <!ATTLIST TH %core; -- Element CLASS, ID and TITLE -- %i18n; -- Internationalization DIR and LANG -- ABBR CDATA #IMPLIED -- Abbreviation for header cell -- AXIS CDATA #IMPLIED -- Names groups of related headers -- COLSPAN NUMBER 1 -- Number of columns spanned by cell -- HEADERS IDREFS #IMPLIED -- List of ID's for header cells -- ROWSPAN NUMBER 1 -- Number of rows spanned by cell -- SCOPE (col|colgroup|row|rowgroup) #IMPLIED -- Scope covered by header cells --> <!ATTLIST THEAD %core; -- Element CLASS, ID and TITLE -- %i18n; -- Internationalization DIR and LANG 
--> <!ATTLIST TITLE %i18n; -- Internationalization DIR and LANG --> <!ATTLIST TR %core; -- Element CLASS, ID and TITLE -- %i18n; -- Internationalization DIR and LANG --> <!ATTLIST UL %core; -- Element CLASS, ID and TITLE -- %i18n; -- Internationalization DIR and LANG --> <!-- Attribute group definition lists --> <!ATTLIST (%physical.styles;) %core; -- Element CLASS, ID and TITLE -- %i18n; -- Internationalization DIR and LANG --> <!ATTLIST (%logical.styles;) %core; -- Element CLASS, ID and TITLE -- %i18n; -- Internationalization DIR and LANG --> <!-- End of file -->
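NOTE: The following sketch is informative only and is not part of the DTD. It illustrates one reading of the heading-order requirements stated in the comment on the attribute definition list for H1 to H6: a heading may be at most one level deeper than the heading before it, and the first heading must be an H1. The function names and the regular expression are assumptions made for illustration; a real checker would work on the parsed document rather than on raw markup.

```python
import re
from typing import List, Tuple

# Matches heading start tags such as <H1> or <h3 class="x">, but not <HR> or </H1>.
HEADING_START = re.compile(r"<H([1-6])\b", re.IGNORECASE)


def heading_levels(markup: str) -> List[int]:
    """Return the levels of the heading start tags in document order."""
    return [int(match.group(1)) for match in HEADING_START.finditer(markup)]


def heading_order_errors(levels: List[int]) -> List[Tuple[int, int]]:
    """Return (position, level) for each heading that skips a level.

    A heading is reported when its level exceeds the previous heading's
    level by more than one. The level before the first heading is taken
    as 0, so a document that opens with H2 is reported.
    """
    errors = []
    previous = 0
    for position, level in enumerate(levels):
        if level > previous + 1:
            errors.append((position, level))
        previous = level
    return errors
```

For example, `heading_order_errors(heading_levels(text))` returns an empty list for the sequence H1, H2, H3, H1, H2 and reports the H3 in the sequence H1, H3.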
Every effort has been made to provide a language specification that is correct and rigorously specified. However, since change is inevitable, facilities have been provided to manage the maintenance of this text.
Error notifications should be made via your national body or via a liaison organization such as the World Wide Web Consortium.
The defects in reports 1 through 6 have been corrected by Technical Corrigendum 1. The remaining defect reports are working documents for use by JTC1/SC34 and the editors of the International Standard. They should be considered as Work in Progress and should not be used for reference.
Defects are corrected following the procedure for "rapid promulgation" [JTC1 14.4.2.3] specified in clauses 14.4.3 through 14.4.10 of the JTC1 Directives.
NOTE: We present defects in a style based on form G17 in the JTC1 Directives.
Defect report number: DR 15445/001
WG Secretariat: Project editors
Date circulated by WG Secretariat: 2000-12-10
Deadline for response from editor: 2000-12-10
Submitter: W3C
For review by: JTC1/SC34/WG3 members
Defect report concerning: ISO/IEC 15445:2000 HyperText Markup Language (HTML)
Qualifier: Omission
References:
Nature of defect: The formal public identifier -//W3C//ENTITIES Full Latin 1//EN//HTML used by ISO/IEC 15445:2000 for the ISO Latin alphabet No. 1 entities contains the public text description `Full Latin 1' and not `Latin 1' or `Latin1' as used by the W3C Recommendations for HTML 4.0 and 4.01.
Solution proposed by submitter: Allow a range of formal public identifiers in the catalog file.
This is a technical defect in the International Standard. We recommend accepting the submitter's proposal. See the new text introduced into the International Standard (highlighted in yellow), a description of the proposed solution and the required Technical Corrigendum.
Defect report number: DR 15445/002
WG Secretariat: Project editors
Date circulated by WG Secretariat: 2000-12-10
Deadline for response from editor: 2000-12-10
Submitter: W3C
For review by: JTC1/SC34/WG3 members
Defect report concerning: ISO/IEC 15445:2000 HyperText Markup Language (HTML)
Qualifier: Omission
References:
Nature of defect: The formal public identifier -//W3C//ENTITIES Symbolic//EN//HTML used by ISO/IEC 15445:2000 for the symbol entities contains the public text description `Symbolic' and not `Symbols' as amended in the W3C Recommendation for HTML 4.01.
Solution proposed by submitter: Allow a range of formal public identifiers in the catalog file.
This is a technical defect in the International Standard. We recommend accepting the submitter's proposal. See the new text introduced into the International Standard (highlighted in yellow), a description of the proposed solution and the required Technical Corrigendum.
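NOTE: The following sketch is informative only. It illustrates the idea behind defect reports 001 and 002, that several formal public identifiers may resolve to the same entity set. The alternative identifiers are formed by substituting the public text descriptions quoted in the defect reports; the system identifiers `HTMLlat1.ent` and `HTMLsymbol.ent`, the function names, and the `PUBLIC "..." "..."` entry layout (assumed here to follow the SGML Open catalog convention) are assumptions, not text taken from the International Standard.

```python
from typing import List, Optional

# Public identifier -> system identifier. File names are hypothetical.
PUBLIC_IDENTIFIERS = {
    "-//W3C//ENTITIES Full Latin 1//EN//HTML": "HTMLlat1.ent",
    "-//W3C//ENTITIES Latin 1//EN//HTML": "HTMLlat1.ent",
    "-//W3C//ENTITIES Latin1//EN//HTML": "HTMLlat1.ent",
    "-//W3C//ENTITIES Symbolic//EN//HTML": "HTMLsymbol.ent",
    "-//W3C//ENTITIES Symbols//EN//HTML": "HTMLsymbol.ent",
}


def normalize(public_id: str) -> str:
    """Collapse runs of white space so that equivalent identifiers compare equal."""
    return " ".join(public_id.split())


def resolve(public_id: str) -> Optional[str]:
    """Return the system identifier for a public identifier, or None if unknown."""
    return PUBLIC_IDENTIFIERS.get(normalize(public_id))


def catalog_lines() -> List[str]:
    """Render the table as one catalog entry per public identifier."""
    return [
        'PUBLIC "{}" "{}"'.format(public_id, system_id)
        for public_id, system_id in PUBLIC_IDENTIFIERS.items()
    ]
```

With this table, a document that declares the `Latin1` identifier and a document that declares the `Full Latin 1` identifier both resolve to the same entity set.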
Defect report number: DR 15445/003
WG Secretariat: Project editors
Date circulated by WG Secretariat: 2000-12-10
Deadline for response from editor: 2000-12-10
Submitter: Project editors
For review by: JTC1/SC34/WG3 members
Defect report concerning: ISO/IEC 15445:2000 HyperText Markup Language (HTML)
Qualifier: Omission
References:
Nature of defect: Subclause 12.2.3 Anchors with the id attribute of the W3C Recommendation for HTML 4.01 now specifies that it is legal for attributes ID and NAME to appear in the same start tag when they are both defined for an element, and that they must have identical values. ISO/IEC 15445:2000 first edition recommended use of the ID attribute but required that the ID and NAME values be distinct [Annex B, part 1, parameter entity core]. Note that the W3C Recommendation for HTML 4.01 permits use of both attributes to specify an element's unique identifier for the elements: <A> [W3C 12.2], <APPLET> [W3C 13.4], <FORM> [W3C 17.3], <FRAME> [W3C 16.2.2], <IFRAME> [W3C 16.5], <IMG> [W3C 13.2] and <MAP> [W3C 13.6.1], but of these, <APPLET> [W3C 13.4], <FRAME> [W3C 16.2.2] and <IFRAME> [W3C 16.5] are excluded from the International Standard, and <FORM> [W3C 17.3] and <IMG> [W3C 13.2] have no NAME attribute.
Solution proposed by submitter: Change the corresponding normative text in the ISO-HTML DTD to allow attributes NAME and ID to appear in the same start tag when they are both defined for an element, and require that they have identical values.
This is a technical defect in the International Standard. We recommend accepting the submitter's proposal. See the new text introduced into the International Standard, two short descriptions of the proposed solutions, here and here, and the required Technical Corrigendum.
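NOTE: The following sketch is informative only. It illustrates the two checks implied by the corrected text: that ID and NAME agree when both are given, and that no value in their shared name space is attached to more than one element. Attribute names are assumed to be supplied in upper case, and the `fold_case` parameter is an assumption introduced here because the DTD comment for <A> folds values to upper case before comparison while the comment for <MAP> describes NAME as case sensitive.

```python
from collections import Counter
from typing import Dict, Iterable, List


def id_name_mismatch(attributes: Dict[str, str], fold_case: bool = True) -> bool:
    """Return True when both ID and NAME are present but their values differ."""
    if "ID" not in attributes or "NAME" not in attributes:
        return False
    id_value, name_value = attributes["ID"], attributes["NAME"]
    if fold_case:
        id_value, name_value = id_value.upper(), name_value.upper()
    return id_value != name_value


def shared_namespace_duplicates(elements: Iterable[Dict[str, str]]) -> List[str]:
    """Return identifier values (folded to upper case) used by more than one element.

    ID and NAME share one name space, so an element that carries the same
    value in both attributes is counted once.
    """
    counts: Counter = Counter()
    for attributes in elements:
        values = {attributes[key].upper() for key in ("ID", "NAME") if key in attributes}
        counts.update(values)
    return sorted(value for value, count in counts.items() if count > 1)
```

For instance, an <A> element with ID and NAME both set to `intro` passes the first check, while a second element that reuses `INTRO` as its NAME is reported by the second.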
Defect report number: DR 15445/004
WG Secretariat: Project editors
Date circulated by WG Secretariat: 2000-12-10
Deadline for response from editor: 2001-01-31
Submitter: Project editors
For review by: JTC1/SC34/WG3 members
Defect report concerning: ISO/IEC 15445:2000 HyperText Markup Language (HTML)
Qualifier: Omission
References:
Nature of defect: Subclause 13.6.1 Client-side image maps of the W3C Recommendation for HTML 4.01 introduces an extended mixed content model for the <MAP> [W3C 13.6.1] element type ((%block;) | AREA)+ which allows %block; elements in addition to <AREA> [W3C 13.6.1] elements, and recommends rendering the block-level content to improve accessibility. ISO/IEC 15445:2000 provides only <AREA> [W3C 13.6.1] elements.
Solution proposed by submitter: The W3C HTML WG have advised us that the <MAP> [W3C 13.6.1] element type is essential for accessibility. We recommend that ISO/IEC 15445:2000 provide the same support for accessibility as the W3C Recommendation for HTML 4.01, by extending the <MAP> [W3C 13.6.1] element type content model to ((%block;) | AREA)+ and adding SHAPE and COORDS attributes to <A> [W3C 12.2]. Note that the restricted definition of the %block; parameter entity in ISO/IEC 15445:2000 prevents %heading; and <ADDRESS> [W3C 7.5.6] elements from appearing in a client-side map.
Defect report number: DR 15445/005
WG Secretariat: Project editors
Date circulated by WG Secretariat: 2000-12-10
Deadline for response from editor: 2000-12-10
Submitter: Project editors
For review by: JTC1/SC34/WG3 members
Defect report concerning: ISO/IEC 15445:2000 HyperText Markup Language (HTML)
Qualifier: Omission
References:
Nature of defect: The International Standard refers to HTML 4.0 `as amended by the W3C errata'. However, the W3C have made HTML 4.01 the specification of the `HTML 4' language, and there are now no W3C errata, so references in the International Standard to the W3C errata are incorrect.
Solution proposed by submitter: Make the W3C Recommendation for HTML 4.01 the reference text.
This is a technical defect in the International Standard. We recommend accepting the submitter's proposal. See the required Technical Corrigendum.
Defect report number: DR 15445/006
WG Secretariat: Project editors
Date circulated by WG Secretariat: 2001-01-15
Deadline for response from editor: 2001-01-31
Submitter: Project editors
For review by: JTC1/SC34/WG3 members
Defect report concerning: ISO/IEC 15445:2000 HyperText Markup Language (HTML)
Qualifier: Omission
References: <FORM> [W3C 17.3] element type.
Nature of defect: The <FORM> [W3C 17.3] element type specified by W3C HTML 4 has the content model (%block;|SCRIPT)+. The content model for the same element type defined by ISO/IEC 15445:2000 is (%block; | %text; | %form.fields; | ADDRESS)+ which allows text content. This `generosity' allows authors to create documents which conform to ISO/IEC 15445 but do not conform to W3C HTML 4. This is a defect since all documents which conform to ISO/IEC 15445 should also conform to the W3C Recommendation for HTML 4.01.
Solution proposed by submitter: Make the following changes to the ISO/IEC 15445 DTD:
<FORM> [W3C 17.3] element type to (%block;)+ to obtain:
<!ELEMENT FORM - - (%block;)+ -(FORM) >
<!ENTITY % text '#PCDATA | %physical.styles; | %logical.styles; | %special; | %form.fields;' >
<FIELDSET> [W3C 17.10] element type to
<!ELEMENT FIELDSET - - (#PCDATA,LEGEND,(%block; | %text; | ADDRESS)+) -(FIELDSET) >
%form.fields; from the declaration of the element type <LABEL> [W3C 17.9.1] to obtain:
<!ELEMENT LABEL - - (%text;)+ -(LABEL) >
%form.content;.
This is a technical defect in the International Standard. We recommend accepting the submitter's proposal. See the required Technical Corrigendum.
Defect report number: DR 15445/007
WG Secretariat: Project editors
Date circulated by WG Secretariat: tba
Deadline for response from editor: tba
Submitter: Project editors
For review by: JTC1/SC34/WG3 members
Defect report concerning: ISO/IEC 15445:2000 HyperText Markup Language (HTML)
Qualifier: Omission
References:
Nature of defect: The International Standard identifies the case folding contradiction and says that "case must not be taken into account", but does not say what is required of the authors.
Solution proposed by submitter: Add text to the DTD to recommend that authors satisfy the competing requirements of SGML and the W3C Recommendation for HTML 4.01 by restricting themselves to the 40 characters "ABCDEFGHIJKLMNOPQRSTUVWXYZ.-_:0123456789" for ID and NAME values, and for the corresponding HREF values.
This is a technical defect in the International Standard. We recommend accepting the submitter's proposal. See the proposed Draft Technical Corrigendum.
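As an illustration, an author's tool might test candidate ID, NAME and HREF values against the 40 recommended characters. The sketch below assumes that only membership in the character set is of interest; it does not check the SGML rule for the first character of a name, and the function names and sample values are hypothetical.

```python
# The 40 characters recommended for ID and NAME values.
SAFE_CHARS = frozenset("ABCDEFGHIJKLMNOPQRSTUVWXYZ.-_:0123456789")


def is_case_safe(value):
    """True if every character of a non-empty value is one of the 40."""
    return len(value) > 0 and all(ch in SAFE_CHARS for ch in value)


def fragment_is_case_safe(href):
    """Check only the fragment identifier (the part after '#') of an HREF.

    An HREF without a fragment refers to no ID or NAME, so it passes.
    """
    fragment = href.partition("#")[2]
    return is_case_safe(fragment) if fragment else True


print(len(SAFE_CHARS))
print(is_case_safe("SECTION-2.1"))
print(is_case_safe("Section-2.1"))
print(fragment_is_case_safe("guide.html#SECTION-2.1"))
```

Because the set contains no lower-case letters, a value that passes the check reads the same whether or not case folding is applied.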
Defect report number: DR 15445/008
WG Secretariat: Project editors
Date circulated by WG Secretariat: tba
Deadline for response from editor: tba
Submitter: Edward Welbourne
For review by: JTC1/SC34/WG3 members
Defect report concerning: ISO/IEC 15445:2000 HyperText Markup Language (HTML)
Qualifier: Omission
References: TYPE=reset.
Nature of defect: When a <BUTTON> [W3C 17.5] has attribute TYPE=reset, its effects are limited by any enclosing <FIELDSET> [W3C 17.10]; but when an <INPUT> [W3C 17.4] has attribute TYPE=reset, which should have the same effect, there is no statement in the International Standard of the limitation due to an enclosing <FIELDSET> [W3C 17.10].
Solution proposed by submitter: Add text to the DTD to state the limitation.
This is an omission in the International Standard. We recommend accepting the submitter's proposal. See the proposed Draft Technical Corrigendum.
Defect report number: DR 15445/009
WG Secretariat: Project editors
Date circulated by WG Secretariat: tba
Deadline for response from editor: tba
Submitter: Russell O'Connor
For review by: JTC1/SC34/WG3 members
Defect report concerning: ISO/IEC 15445:2000 HyperText Markup Language (HTML)
Qualifier: Omission
References:
Nature of defect: Clause 9.2 provides an architectural support declaration using a PI-based syntax. However ISO/IEC 10744:1997 (HyTime) in Annex A.3 provides a different syntax based on an attribute definition list declaration. The International Standard offers no explanation for this discrepancy.
Solution proposed by submitter: Add the second syntax.
The two syntaxes are both valid, but the PI-based syntax has not yet been published. We recommend accepting the submitter's proposal. See the proposed Draft Technical Corrigendum.
The alternative syntax proposed by the submitter is used in the production of the International Standard and the User's Guide.
There is an excellent online bibliography by Robin Cover for SGML and XML topics. Detailed references for international standards are available at the ISO's WWW site and details of the ISO/IEC JTC1 programme of work are available at the JTC1 WWW site. W3C documents will be found at the W3C site. The IETF RFCs will be found at the IETF WWW site. A bibliography of ``Dublin Core Relevant Publications'' is available.
The figure illustrates the different ways of referring to a character. Each character is given a name such as "CAPITAL LETTER E WITH GRAVE ACCENT", and the characters are placed in an ordered set known as the character repertoire. The elements (the characters) of the set are assigned decimal numbers 0, 1, 2, 3, and so on. The decimal number for a character is called the code position, and the code position for "CAPITAL LETTER E WITH GRAVE ACCENT" is 200. SGML calls these decimal numbers the character numbers.
The function from "code position" to "character name" is called the coded character set by RFC 1866. The second column in the figure shows the code position as a hexadecimal value which represents a binary pattern. The ordered set of binary patterns is called the code set by SGML.
The one-to-one relation between the binary pattern and the character name is called the coded character set by ISO 8859-1. The function from name to pattern is called the character set by SGML and the function from pattern to name is called the character encoding scheme by RFC 1866.
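The relations described above can be observed with a few lines of Python. This is a sketch for illustration only: the function name is hypothetical, and the name reported by Python's unicodedata module is the Unicode character name, which is worded slightly differently from the name quoted in this annex.

```python
import unicodedata


def describe(ch):
    """Return the code position, hexadecimal form, binary pattern and name."""
    code_position = ord(ch)  # the decimal number SGML calls the character number
    return {
        "code_position": code_position,
        "hexadecimal": format(code_position, "02X"),
        "pattern": ch.encode("latin-1"),   # the single octet used by ISO 8859-1
        "name": unicodedata.name(ch),      # Unicode name, not the ISO 8859-1 wording
    }


info = describe("\u00c8")
print(info)

# Going the other way: from the binary pattern back to the character.
print(bytes([200]).decode("latin-1") == "\u00c8")
```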
To facilitate entry of characters not on a keyboard, entity sets such as "ISO latin 1" provide entities for accented characters. The "CAPITAL LETTER E WITH GRAVE ACCENT" may be entered as &Egrave;. A character may also be entered using its decimal code position in the form of a numeric character reference, such as &#200;. The figure also provides in the final column an approximation for the printed glyph.
NOTE: An interesting use of numeric character references is to obfuscate the markup of an e-mail address in a web page, so that it is not harvested by spam-bots.
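A minimal sketch of both forms of reference, and of the obfuscation mentioned in the NOTE, is given below. The helper name and the e-mail address are hypothetical; the sketch shows only that the encoded text decodes back to the original, and makes no claim about how effective the technique is against any particular harvester.

```python
import html


def to_numeric_refs(text):
    """Rewrite every character as a decimal numeric character reference."""
    return "".join(f"&#{ord(ch)};" for ch in text)


# The entity reference and the numeric character reference name the same character.
print(html.unescape("&Egrave;") == html.unescape("&#200;"))

# Hypothetical address, encoded for inclusion in a page.
address = "user@example.org"
encoded = to_numeric_refs(address)
print(encoded)

# A browser resolves the references; html.unescape does the same here.
print(html.unescape(encoded) == address)
```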
The figure illustrates the progressive nesting of sections. The model is one of geographic entities containing one another. The sections have a rank: an <H1> is called a continent, an <H2> is called a country, an <H3> is called a province, an <H4> is called a city, and so on. The idea is that a province may contain a city but not the other way around. The nesting must also be progressive, i.e. if a continent contains a province, there must be an intermediate country.
An <H1> continent may contain more than one <H2> country, and an <H2> country may contain more than one <H3> province.
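The progressive nesting rule can be sketched as a check over the sequence of heading ranks in document order. The function below is an illustration only, with a hypothetical name; it assumes that the first heading must be an <H1>, and it reports every heading that is more than one rank deeper than the heading before it.

```python
def nesting_errors(levels):
    """Return (index, previous_rank, rank) for each heading that skips a rank.

    `levels` lists heading ranks in document order, e.g. [1, 2, 3] for
    <H1>, <H2>, <H3>. Moving back up by any number of ranks is allowed;
    moving down by more than one rank is not.
    """
    errors = []
    previous = 0  # assumption: the document must begin with an <H1>
    for index, level in enumerate(levels):
        if level > previous + 1:
            errors.append((index, previous, level))
        previous = level
    return errors


# A continent with two countries, the second with a province and a city.
print(nesting_errors([1, 2, 2, 3, 4, 1]))

# A continent containing a province with no intermediate country.
print(nesting_errors([1, 3]))
```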
The figure shows a graphic containing a row of five equal circles. The circles are inscribed with regular polygons: a triangle, a square, a hexagon (6 sided), a decagon (10 sided) and a duodecagon (12 sided). Clicking on one of the polygons leads to a text giving a formula for the surface area.
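The formulae themselves are not reproduced in this description. As a sketch, the standard result for a regular polygon with n sides inscribed in a circle of radius r, namely area = (n/2) r^2 sin(2 pi / n), can be evaluated as follows; the function name and the choice of a unit radius are assumptions made for illustration.

```python
import math


def inscribed_polygon_area(sides, radius):
    """Area of a regular polygon with `sides` sides inscribed in a circle."""
    if sides < 3:
        raise ValueError("a polygon needs at least three sides")
    # The polygon is made of `sides` isosceles triangles with apex angle 2*pi/sides.
    return 0.5 * sides * radius ** 2 * math.sin(2 * math.pi / sides)


# The five polygons of the figure, drawn in circles of radius 1.
for sides in (3, 4, 6, 10, 12):
    print(sides, round(inscribed_polygon_area(sides, 1.0), 4))
```

As the number of sides grows, the area approaches that of the circle itself.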
The figure shows a "document in preparation" which contains:
The figure shows a piece of the catalogue which associates the entity "legal" with the file legal.txt. The figure shows that the file legal.txt contains a declaration of the general entity "fineprint". The reference to parameter entity "legal" has the effect of declaring the entity "fineprint", which is then available for reference in the body of the document.
NOTE: The legal text is in such a small font that it is impossible to read it.
ICS 35.240.30
Price of the International Standard is based on a printed size of 20 pages.
PURL: http://purl.org/NET/ISO+IEC.15445/Users-Guide.html
Last change was on 2003-04-24Z10:10:18 UTC