xml-stylesheet 5

Work in Progress — Last Update 4 February 2008

This version:
...
Latest version:
...
Previous versions:
...
Editor:
Simon Pieters, Opera Software, simonp@opera.com

Abstract

...

Table of Contents


1. Conformance Requirements

...

1.1. Dependencies

...DOM Core, XML...

2. Writing PIs with pseudo-attributes

The data attribute of a PI that is said to follow the rules for parsing PIs with pseudo-attributes must match the PIData production below.

[1] PIData ::= (S* PseudoAtt (S PseudoAtt)*)? S*
[2] S ::= (#x9 | #xA | #xD | #x20)+
[3] PseudoAtt ::= Name Eq (SingleQuoted | DoubleQuoted) [CC: Unique PseudoAtt Spec]
[4] Name ::= (Char - ('=' | S))+
[5] Eq ::= S* '=' S*
[6] SingleQuoted ::= "'" (AttContent - "'")* "'"
[7] DoubleQuoted ::= '"' (AttContent - '"')* '"'
[8] AttContent ::= Char - ('<' | '&') | EntityRef | CharRef
[9] EntityRef ::= '&amp;' | '&lt;' | '&gt;' | '&quot;' | '&apos;'
[10] CharRef ::= '&#' [0-9]+ ';' | '&#x' [0-9a-fA-F]+ ';' [CC: Legal Character]
[11] Char ::= [#x0-#x10FFFF] /* Any Unicode code point */

2.1. Conformance constraints

Unique PseudoAtt Spec

A pseudo-attribute name must not appear more than once.

Legal Character

Characters referred to using character references must not refer to U+0000 or U+D800..U+DFFF.
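
The productions and the two conformance constraints above can be checked mechanically. The following Python sketch is one possible reading, not part of this specification: it transcribes the productions into regular expressions and then checks the two constraints separately. The function name check_pi_data and the wording of the problem messages are hypothetical. Rejecting character references above 0x10FFFF is an assumption borrowed from the Char production and from the parsing rules in section 3; the Legal Character constraint itself only names U+0000 and the surrogate range.

```python
import re

# Productions [2] to [10], transcribed as regular expression fragments.
S = r"[\t\n\r ]"
NAME = r"[^=\t\n\r ]+"
REF = r"&(?:amp|lt|gt|quot|apos);|&#[0-9]+;|&#x[0-9a-fA-F]+;"
SINGLE = "'(?:[^<&']|" + REF + ")*'"
DOUBLE = '"(?:[^<&"]|' + REF + ')*"'
PSEUDO_ATT = "(" + NAME + ")" + S + "*=" + S + "*(" + SINGLE + "|" + DOUBLE + ")"
PI_DATA = "(?:" + S + "*" + PSEUDO_ATT + "(?:" + S + "+" + PSEUDO_ATT + ")*)?" + S + "*"

PI_DATA_RE = re.compile(PI_DATA)
PSEUDO_ATT_RE = re.compile(PSEUDO_ATT)
CHAR_REF_RE = re.compile(r"&#(x?)([0-9a-fA-F]+);")


def check_pi_data(data):
    """Return a list of problems; an empty list means the data conforms."""
    if not PI_DATA_RE.fullmatch(data):
        return ["does not match the PIData production"]
    problems = []
    seen = set()
    for att in PSEUDO_ATT_RE.finditer(data):
        name, quoted_value = att.group(1), att.group(2)
        # Conformance constraint: Unique PseudoAtt Spec
        if name in seen:
            problems.append(f"duplicate pseudo-attribute {name!r}")
        seen.add(name)
        # Conformance constraint: Legal Character
        for is_hex, digits in CHAR_REF_RE.findall(quoted_value):
            number = int(digits, 16 if is_hex else 10)
            # The upper bound is an assumption; see the text above.
            if number == 0 or 0xD800 <= number <= 0xDFFF or number > 0x10FFFF:
                problems.append(f"illegal character reference in {name!r}")
    return problems
```

For example, check_pi_data('href="a.css" type="text/css"') returns an empty list, while check_pi_data('href=a.css') reports that the data does not match the PIData production because the value is not quoted.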

3. Parsing PIs with pseudo-attributes

The rules for parsing PIs with pseudo-attributes are defined in this section. The UA must follow these rules whenever the PI's data attribute changes, and whenever the PI is inserted into the DOM or moved within the DOM (which might be a result of changing nearby nodes).

When the UA hits a parse error, it must act as described below, and may also inform the user that there was an error (e.g. in the error console).

When the UA is to stop parsing, it must stop the state machine so that pseudo-attributes can be processed.

A pseudo-attribute has a name and a value. When a new pseudo-attribute is created, its name and value must be set to the empty string.

A pseudo-attribute can be marked as being in error. This will result in the pseudo-attribute being ignored when it has been completely parsed.

The next input character is the first character in the PI's data attribute that has not yet been consumed. Initially the next input character is the first character in the attribute.

"EOF" is a conceptual character representing the end of the PI's data attribute.

Let pseudo-attributes be the empty array. Start in the before name state.

The state machine is as follows:

Before name state

Consume the next input character:

U+0009 CHARACTER TABULATION
U+000A LINE FEED (LF)
U+000D CARRIAGE RETURN (CR)
U+0020 SPACE
Stay in the before name state.
U+003D EQUALS SIGN (=)
Parse error. Stop parsing.
EOF
Stop parsing.
Anything else
Create a new pseudo-attribute and append the input character to its name. Switch to the name state.

Name state

Consume the next input character:

U+0009 CHARACTER TABULATION
U+000A LINE FEED (LF)
U+000D CARRIAGE RETURN (CR)
U+0020 SPACE
If there is a pseudo-attribute in pseudo-attributes that has the same name as this pseudo-attribute, then this is a parse error; mark the pseudo-attribute as being in error. In any case, switch to the after name state.
U+003D EQUALS SIGN (=)
If there is a pseudo-attribute in pseudo-attributes that has the same name as this pseudo-attribute, then this is a parse error; mark the pseudo-attribute as being in error. In any case, switch to the before value state.
EOF
Parse error. Stop parsing.
Anything else
Append the input character to the pseudo-attribute's name. Stay in the name state.

After name state

Consume the next input character:

U+0009 CHARACTER TABULATION
U+000A LINE FEED (LF)
U+000D CARRIAGE RETURN (CR)
U+0020 SPACE
Stay in the after name state.
U+003D EQUALS SIGN (=)
Switch to the before value state.
Anything else
Parse error. Stop parsing.

Before value state

Consume the next input character:

U+0009 CHARACTER TABULATION
U+000A LINE FEED (LF)
U+000D CARRIAGE RETURN (CR)
U+0020 SPACE
Stay in the before value state.
U+0022 QUOTATION MARK (")
Switch to the value (double-quoted) state.
U+0027 APOSTROPHE (')
Switch to the value (single-quoted) state.
Anything else
Parse error. Stop parsing.

Value (double-quoted) state

Consume the next input character:

U+0022 QUOTATION MARK (")
If the pseudo-attribute is not in error, then append the pseudo-attribute to pseudo-attributes. In any case, switch to the after value state.
U+0026 AMPERSAND (&)
Attempt to consume an entity.
U+003C LESS-THAN SIGN (<)
Parse error. Mark the pseudo-attribute as being in error.
EOF
Parse error. Stop parsing.
Anything else
Append the character to the pseudo-attribute's value.

Value (single-quoted) state

Consume the next input character:

U+0027 APOSTROPHE (')
If the pseudo-attribute is not in error, then append the pseudo-attribute to pseudo-attributes. In any case, switch to the after value state.
U+0026 AMPERSAND (&)
Attempt to consume an entity.
U+003C LESS-THAN SIGN (<)
Parse error. Mark the pseudo-attribute as being in error.
EOF
Parse error. Stop parsing.
Anything else
Append the character to the pseudo-attribute's value.

After value state

Consume the next input character:

U+0009 CHARACTER TABULATION
U+000A LINE FEED (LF)
U+000D CARRIAGE RETURN (CR)
U+0020 SPACE
Switch to the before name state.
EOF
Stop parsing.
Anything else
Parse error. Reconsume the current input character in the name state.
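
The state machine above can be sketched in Python as follows. This is an illustration under stated assumptions, not a normative implementation. The two value states are folded into one state that remembers the opening quote. Entity handling is delegated to a hypothetical hook, consume_entity, because it is defined separately in section 3.1; the placeholder default, failed_entity, simply reports every entity as being in error. When the after value state reconsumes a character in the name state, the sketch assumes that a fresh pseudo-attribute is started, which the text above does not say explicitly. All function and variable names are made up for this example.

```python
WHITESPACE = "\t\n\r "


def failed_entity(data, pos):
    # Placeholder hook: reports every entity as being in error.
    return None, pos


def parse_pseudo_attributes(data, consume_entity=failed_entity):
    """Return the (name, value) pairs that survive parsing of a PI's data.

    consume_entity(data, pos) is a hypothetical hook, called with pos just
    past the '&'. It returns (text, new_pos), with text None on error.
    """
    attrs = []                      # the "pseudo-attributes" array
    name = value = quote = ""
    in_error = False
    state, pos = "before name", 0
    while True:
        ch = data[pos] if pos < len(data) else None   # None stands for EOF
        pos += 1
        if state == "before name":
            if ch is None or ch == "=":
                break               # EOF, or parse error on '='
            if ch not in WHITESPACE:
                name, value, in_error = ch, "", False
                state = "name"
        elif state == "name":
            if ch is None:
                break               # parse error
            if ch in WHITESPACE or ch == "=":
                if any(n == name for n, _ in attrs):
                    in_error = True  # duplicate name: parse error
                state = "after name" if ch in WHITESPACE else "before value"
            else:
                name += ch
        elif state == "after name":
            if ch == "=":
                state = "before value"
            elif ch is None or ch not in WHITESPACE:
                break               # parse error
        elif state == "before value":
            if ch in ('"', "'"):
                quote, state = ch, "value"
            elif ch is None or ch not in WHITESPACE:
                break               # parse error
        elif state == "value":
            if ch is None:
                break               # parse error
            if ch == quote:
                if not in_error:
                    attrs.append((name, value))
                state = "after value"
            elif ch == "&":
                text, pos = consume_entity(data, pos)
                if text is None:
                    in_error = True
                else:
                    value += text
            elif ch == "<":
                in_error = True      # parse error
            else:
                value += ch
        else:                        # after value
            if ch is None:
                break
            if ch in WHITESPACE:
                state = "before name"
            else:
                # Parse error; reconsume in the name state. Assumption:
                # a fresh pseudo-attribute is started here.
                name, value, in_error = "", "", False
                state = "name"
                pos -= 1
    return attrs
```

With this sketch, parse_pseudo_attributes('href="a.css" type="text/css"') yields [('href', 'a.css'), ('type', 'text/css')], and for 'a="1" a="2" b=\'3\'' the second a is dropped as a duplicate, leaving [('a', '1'), ('b', '3')].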

3.1. Tokenizing entities

This section defines how to consume an entity.

The behavior depends on the identity of the next character (the one immediately after the U+0026 AMPERSAND character):

U+0023 NUMBER SIGN (#)

Consume the U+0023 NUMBER SIGN.

The behavior further depends on the character after the U+0023 NUMBER SIGN:

U+0078 LATIN SMALL LETTER X

Consume the U+0078 LATIN SMALL LETTER X.

Follow the steps below, but using the range of characters U+0030 DIGIT ZERO through to U+0039 DIGIT NINE, U+0061 LATIN SMALL LETTER A through to U+0066 LATIN SMALL LETTER F, and U+0041 LATIN CAPITAL LETTER A through to U+0046 LATIN CAPITAL LETTER F (in other words, 0-9, a-f, and A-F).

When it comes to interpreting the number, interpret it as a hexadecimal number.

Anything else

Follow the steps below, but using the range of characters U+0030 DIGIT ZERO through to U+0039 DIGIT NINE (i.e. just 0-9).

When it comes to interpreting the number, interpret it as a decimal number.

Consume as many characters as match the range of characters given above.

If no characters match the range, then this is a parse error; mark the pseudo-attribute as being in error.

Otherwise, if the next character is a U+003B SEMICOLON, consume that too. If it isn't, then this is a parse error; mark the pseudo-attribute as being in error.

If one or more characters match the range, then take them all and interpret the string of characters as a number (either hexadecimal or decimal as appropriate).

If the number is zero, if the number is higher than 0x10FFFF, or if it's one of the surrogate characters (characters in the range 0xD800 to 0xDFFF), then this is a parse error; mark the pseudo-attribute as being in error.

Otherwise, append the Unicode character whose code point is that number to the pseudo-attribute's value.

Anything else

Consume the maximum number of characters possible, with the consumed characters case-sensitively matching one of the identifiers in the first column of the following table:

Entity name Character
amp; U+0026
apos; U+0027
gt; U+003E
lt; U+003C
quot; U+0022

If no match can be made, then this is a parse error; mark the pseudo-attribute as being in error.

Otherwise, append the character corresponding to the entity name (as given by the second column of the table above) to the pseudo-attribute's value.
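
The entity rules in this section can be sketched as a single Python function. This is a minimal illustration, not a definitive implementation; the name consume_entity and its calling convention (the input string plus the index just past the ampersand, returning the decoded character and the new index) are assumptions made for this example. One simplification is labeled in the code: when the semicolon is missing, the sketch returns no character at all, which leads to the same outcome because the pseudo-attribute is marked as being in error and later ignored.

```python
# Entity names from the table above, each including its trailing semicolon.
NAMED = {"amp;": "&", "apos;": "'", "gt;": ">", "lt;": "<", "quot;": '"'}
DECIMAL = "0123456789"
HEX = DECIMAL + "abcdefABCDEF"


def consume_entity(data, pos):
    """Consume an entity; pos is the index just after the '&'.

    Returns (char, new_pos). char is None when the pseudo-attribute
    must be marked as being in error.
    """
    if data.startswith("#", pos):
        pos += 1
        digits, base = DECIMAL, 10
        if data.startswith("x", pos):      # only a lowercase x selects hex
            pos += 1
            digits, base = HEX, 16
        start = pos
        while pos < len(data) and data[pos] in digits:
            pos += 1
        if pos == start:
            return None, pos               # no digits: parse error
        number = int(data[start:pos], base)
        ok = True
        if data.startswith(";", pos):
            pos += 1
        else:
            ok = False                     # missing semicolon (simplified)
        if number == 0 or number > 0x10FFFF or 0xD800 <= number <= 0xDFFF:
            ok = False                     # illegal code point
        return (chr(number) if ok else None), pos
    # The names share no prefixes, so the first match is the longest match.
    for name, char in NAMED.items():
        if data.startswith(name, pos):     # case-sensitive comparison
            return char, pos + len(name)
    return None, pos                       # no match: parse error
```

For example, consume_entity("&#x41;", 1) returns ("A", 6), and consume_entity("&nbsp;", 1) returns (None, 1) because nbsp; is not in the table.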

4. Processing of xml-stylesheet PIs

For xml-stylesheet PIs that are children of the Document object and are before the root element (if any), the UA must use the rules for parsing PIs with pseudo-attributes to obtain the pseudo-attributes.
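
As an illustration of which PIs qualify, the following Python sketch walks the children of a Document and collects the data of xml-stylesheet PIs that appear before the root element. It uses the standard library's xml.dom.minidom only as a convenient DOM for the example; the function name prolog_stylesheet_pis is hypothetical.

```python
from xml.dom import Node, minidom


def prolog_stylesheet_pis(document):
    """Return the data of xml-stylesheet PIs that are children of the
    document and come before the root element (if any)."""
    found = []
    for node in document.childNodes:
        if node.nodeType == Node.ELEMENT_NODE:
            break   # the root element: later PIs do not qualify
        if (node.nodeType == Node.PROCESSING_INSTRUCTION_NODE
                and node.target == "xml-stylesheet"):
            found.append(node.data)
    return found


doc = minidom.parseString(
    '<?xml-stylesheet href="a.css"?>'
    '<root><?xml-stylesheet href="b.css"?></root>'
    '<?xml-stylesheet href="c.css"?>'
)
print(prolog_stylesheet_pis(doc))
```

Only the first PI is reported here: the second is a descendant of the root element rather than a child of the Document, and the third comes after the root element.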

The type pseudo-attribute represents a hint about the resource's MIME type, and must consist of a valid MIME type, optionally with parameters. The UA may opt to abort processing the PI if the MIME type given in the type pseudo-attribute is known to be unsupported. For the purposes of this pseudo-attribute, text/xsl must be assumed to be an XML MIME type.

The href pseudo-attribute gives the address of the resource. The pseudo-attribute must be present and must consist of an IRI reference. If the pseudo-attribute is present, the UA must begin to download the resource (subject to UA-specific downloading policies, e.g. security).

The title pseudo-attribute defines alternative style sheet sets. [CSSOM]

The alternate pseudo-attribute must have either the literal value yes or the literal value no. If the value is yes, then the referenced resource is an alternative style sheet. [CSSOM]

The media pseudo-attribute says which media the referenced resource applies to. The value must be a valid media query. [MQ] The UA must only apply the styles to views while their state matches the listed media. [DOM2VIEWS]

A previous version of this specification had a charset pseudo-attribute, which has been dropped in this version.

If the MIME type of the resource given in the href pseudo-attribute is text/css (ignoring parameters), then the resource must be processed according to the rules in CSS. [CSS21]

If the PI is being processed because it was inserted by the XML processor, and the resource given in the href pseudo-attribute is XML (text/xml, application/xml, or any MIME type that ends with +xml (ignoring parameters)), and the root element (or, if the fragment identifier is present, the element that has that ID) is in the namespace http://www.w3.org/1999/XSL/Transform or has a version attribute in that namespace, then that document (or element) must be processed according to the rules in XSLT [XSLT]. In this case, the title, alternate and media pseudo-attributes do not apply.
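
The two dispatch rules above can be sketched as follows. This Python fragment is a hedged illustration only: the function names are hypothetical, root_is_xslt stands in for the namespace test on the root (or identified) element, and comparing MIME types case-insensitively is an assumption that this section does not state.

```python
def strip_parameters(mime):
    """Drop any parameters and normalize case (assumption: MIME types
    are compared case-insensitively)."""
    return mime.split(";", 1)[0].strip().lower()


def is_xml_mime_type(mime):
    """text/xml, application/xml, or any type ending in +xml."""
    essence = strip_parameters(mime)
    return essence in ("text/xml", "application/xml") or essence.endswith("+xml")


def choose_processing(resource_mime, inserted_by_xml_processor, root_is_xslt):
    """Return 'css', 'xslt', or None when neither rule applies.

    root_is_xslt is a hypothetical flag: True when the root element (or
    the element named by the fragment identifier) passes the XSLT
    namespace test described above.
    """
    if strip_parameters(resource_mime) == "text/css":
        return "css"
    if inserted_by_xml_processor and is_xml_mime_type(resource_mime) and root_is_xslt:
        return "xslt"
    return None
```

For example, choose_processing("text/css; charset=utf-8", False, False) returns "css", while an XML resource with an XSLT root is only treated as XSLT when the PI was inserted by the XML processor.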

References

...

Acknowledgments

Thanks to Anne van Kesteren, Daniel Bratell, George Chavchanidze, Ian Hickson, Jens Lindström, Maciej Stachowiak, and Philip Taylor for their useful and substantial comments.