Living Standard — Last Updated 16 January 2024
APIs for dynamically inserting markup into the document interact with the parser, and thus their behavior varies depending on whether they are used with HTML documents (and the HTML parser) or XML documents (and the XML parser).
Document objects have a throw-on-dynamic-markup-insertion counter, which is used in conjunction with the create an element for the token algorithm to prevent custom element constructors from being able to use document.open(), document.close(), and document.write() when they are invoked by the parser. Initially, the counter must be set to zero.
document = document.open()
Causes the Document to be replaced in-place, as if it was a new Document object, but reusing the previous object, which is then returned.
The resulting Document has an HTML parser associated with it, which can be given data to parse using document.write().
The method has no effect if the Document is still being parsed.
Throws an "InvalidStateError" DOMException if the Document is an XML document.
Throws an "InvalidStateError" DOMException if the parser is currently executing a custom element constructor.
window = document.open(url, name, features)
Works like the window.open() method.
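As a rough illustration of the summary above, the following sketch replaces a document's contents through open(), write(), and close(), and reports the "InvalidStateError" cases instead of letting them propagate. The helper name replaceDocumentContents is hypothetical. It takes the Document as a parameter so that it can be exercised against any object exposing these three methods.

```javascript
// Hypothetical helper: replaces the contents of `doc` with `html`.
// Returns true on success, and false if the document rejected the call
// with an "InvalidStateError" (XML document, or a custom element
// constructor is currently being run by the parser).
function replaceDocumentContents(doc, html) {
  try {
    doc.open();      // reuses the same Document object
    doc.write(html); // feeds the script-created parser
    doc.close();     // closes the input stream opened by open()
    return true;
  } catch (err) {
    if (err && err.name === "InvalidStateError") {
      return false;
    }
    throw err; // anything else (e.g. "SecurityError") is left to the caller
  }
}
```

In a page this would be called as replaceDocumentContents(document, "<p>Hello</p>"). Note that open() can also return without doing anything, in which case this sketch still reports success.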
Document objects have an active parser was aborted boolean, which is used to prevent scripts from invoking the document.open() and document.write() methods (directly or indirectly) after the document's active parser has been aborted. It is initially false.
The document open steps, given a document, are as follows:
If document is an XML document, then throw an "InvalidStateError" DOMException.
If document's throw-on-dynamic-markup-insertion counter is greater than 0, then throw an "InvalidStateError" DOMException.
Let entryDocument be the entry global object's associated Document.
If document's origin is not same origin to entryDocument's origin, then throw a "SecurityError" DOMException.
If document has an active parser whose script nesting level is greater than 0, then return document.
This basically causes document.open() to be ignored when it's called in an inline script found during parsing, while still letting it have an effect when called from a non-parser task such as a timer callback or event handler.
Similarly, if document's unload counter is greater than 0, then return document.
This basically causes document.open() to be ignored when it's called from a beforeunload, pagehide, or unload event handler while the Document is being unloaded.
If document's active parser was aborted is true, then return document.
This notably causes document.open() to be ignored if it is called after a navigation has started, but only during the initial parse. See issue #4723 for more background.
If document's node navigable is non-null and document's node navigable's ongoing navigation is a navigation ID, then stop loading document's node navigable.
For each shadow-including inclusive descendant node of document, erase all event listeners and handlers given node.
If document is the associated Document of document's relevant global object, then erase all event listeners and handlers given document's relevant global object.
Replace all with null within document.
If document is fully active, then:
Let newURL be a copy of entryDocument's URL.
If entryDocument is not document, then set newURL's fragment to null.
Run the URL and history update steps with document and newURL.
Set document's is initial about:blank to false.
If document's iframe load in progress flag is set, then set document's mute iframe load flag.
Set document to no-quirks mode.
Create a new HTML parser and associate it with document. This is a script-created parser (meaning that it can be closed by the document.open() and document.close() methods, and that the tokenizer will wait for an explicit call to document.close() before emitting an end-of-file token). The encoding confidence is irrelevant.
Set the insertion point to point at just before the end of the input stream (which at this point will be empty).
Update the current document readiness of document to "loading".
This causes a readystatechange event to fire, but the event is actually unobservable to author code, because of the previous step which erased all event listeners and handlers that could observe it.
Return document.
The document open steps do not affect whether a Document is ready for post-load tasks or completely loaded.
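The order of the early checks in the document open steps matters, since the throwing conditions are tested before the conditions that make the call a no-op. The following sketch models only those checks as a pure function. It is not an implementation of the algorithm: the state object and all of its field names are invented for illustration, and everything after the checks (stopping the navigable, erasing listeners, creating the parser) is collapsed into the single result "opened".

```javascript
// Simplified model of the early checks in the document open steps.
// `state` is a plain object; its field names are hypothetical.
function documentOpenOutcome(state) {
  if (state.isXMLDocument) return "throw InvalidStateError";
  if (state.throwOnDynamicMarkupInsertionCounter > 0) return "throw InvalidStateError";
  if (!state.sameOriginAsEntryDocument) return "throw SecurityError";
  // The remaining checks return the document unchanged.
  if (state.hasActiveParser && state.scriptNestingLevel > 0) return "ignored";
  if (state.unloadCounter > 0) return "ignored";
  if (state.activeParserWasAborted) return "ignored";
  return "opened";
}

// A state in which open() takes effect.
const openableState = {
  isXMLDocument: false,
  throwOnDynamicMarkupInsertionCounter: 0,
  sameOriginAsEntryDocument: true,
  hasActiveParser: false,
  scriptNestingLevel: 0,
  unloadCounter: 0,
  activeParserWasAborted: false
};
```

For example, documentOpenOutcome({ ...openableState, unloadCounter: 1 }) yields "ignored", which corresponds to a call made from an unload handler.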
The open(unused1, unused2) method must return the result of running the document open steps with this.
The unused1 and unused2 arguments are ignored, but kept in the IDL to allow code that calls the function with one or two arguments to continue working. They are necessary due to Web IDL overload resolution algorithm rules, which would throw a TypeError exception for such calls had the arguments not been there. whatwg/webidl issue #581 investigates changing the algorithm to allow for their removal. [WEBIDL]
The open(url, name, features) method must run these steps:
If this is not fully active, then throw an "InvalidAccessError" DOMException.
Return the result of running the window open steps with url, name, and features.
document.close()
Closes the input stream that was opened by the document.open() method.
Throws an "InvalidStateError" DOMException if the Document is an XML document.
Throws an "InvalidStateError" DOMException if the parser is currently executing a custom element constructor.
The close() method must run the following steps:
If this is an XML document, then throw an "InvalidStateError" DOMException.
If this's throw-on-dynamic-markup-insertion counter is greater than zero, then throw an "InvalidStateError" DOMException.
If there is no script-created parser associated with this, then return.
Insert an explicit "EOF" character at the end of the parser's input stream.
If this's pending parsing-blocking script is not null, then return.
Run the tokenizer, processing resulting tokens as they are emitted, and stopping when the tokenizer reaches the explicit "EOF" character or spins the event loop.
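The close() steps above can be summarized as a small decision function. This is only a sketch of the branching, under the assumption that the document's state is available as a plain object. The field names are invented for illustration and do not correspond to any real API.

```javascript
// Simplified model of the close() method steps.
// `state` is a plain object; its field names are hypothetical.
function documentCloseOutcome(state) {
  if (state.isXMLDocument) return "throw InvalidStateError";
  if (state.throwOnDynamicMarkupInsertionCounter > 0) return "throw InvalidStateError";
  // Without a script-created parser there is nothing to close.
  if (!state.hasScriptCreatedParser) return "no-op";
  // From here on, an explicit "EOF" is appended to the input stream.
  if (state.hasPendingParsingBlockingScript) return "EOF inserted, tokenizer deferred";
  return "EOF inserted, tokenizer run";
}
```

One consequence visible in the model is that calling close() on a document that was never opened by script is harmless, since the third check returns before anything is inserted.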
document.write()
document.write(...text)
In general, adds the given string(s) to the Document's input stream.
This method has very idiosyncratic behavior. In some cases, this method can affect the state of the HTML parser while the parser is running, resulting in a DOM that does not correspond to the source of the document (e.g. if the string written is the string "<plaintext>" or "<!--"). In other cases, the call can clear the current page first, as if document.open() had been called. In yet more cases, the method is simply ignored, or throws an exception. User agents are explicitly allowed to avoid executing script elements inserted via this method. And to make matters even worse, the exact behavior of this method can in some cases be dependent on network latency, which can lead to failures that are very hard to debug. For all these reasons, use of this method is strongly discouraged.
Throws an "InvalidStateError" DOMException when invoked on XML documents.
Throws an "InvalidStateError" DOMException if the parser is currently executing a custom element constructor.
Document objects have an ignore-destructive-writes counter, which is used in conjunction with the processing of script elements to prevent external scripts from being able to use document.write() to blow away the document by implicitly calling document.open().
Initially, the counter must be set to zero.
The document write steps, given a Document object document and a string input, are as follows:
If document is an XML document, then throw an "InvalidStateError" DOMException.
If document's throw-on-dynamic-markup-insertion counter is greater than 0, then throw an "InvalidStateError" DOMException.
If document's active parser was aborted is true, then return.
If the insertion point is undefined, then:
If document's unload counter is greater than 0 or document's ignore-destructive-writes counter is greater than 0, then return.
Run the document open steps with document.
Insert input into the input stream just before the insertion point.
If document's pending parsing-blocking script is null, then have the HTML parser process input, one code point at a time, processing resulting tokens as they are emitted, and stopping when the tokenizer reaches the insertion point or when the processing of the tokenizer is aborted by the tree construction stage (this can happen if a script end tag token is emitted by the tokenizer).
If the document.write() method was called from script executing inline (i.e. executing because the parser parsed a set of script tags), then this is a reentrant invocation of the parser. If the parser pause flag is set, the tokenizer will abort immediately and no HTML will be parsed, per the tokenizer's parser pause flag check.
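The branching in the document write steps determines whether a call throws, is silently dropped, implicitly opens the document first, or simply inserts into the existing input stream. The sketch below models just that branching. It assumes the relevant state is given as a plain object with invented field names, and it does not model the document open steps themselves, which can in turn throw or return early.

```javascript
// Simplified model of the branching in the document write steps.
// `state` is a plain object; its field names are hypothetical.
function documentWriteOutcome(state) {
  if (state.isXMLDocument) return "throw InvalidStateError";
  if (state.throwOnDynamicMarkupInsertionCounter > 0) return "throw InvalidStateError";
  if (state.activeParserWasAborted) return "ignored";
  if (!state.insertionPointDefined) {
    // A write here would be destructive, so two counters can veto it.
    if (state.unloadCounter > 0 || state.ignoreDestructiveWritesCounter > 0) {
      return "ignored";
    }
    return "open, then insert";
  }
  return "insert";
}
```

In this model a call made while the parser is still running an inline script has insertionPointDefined set to true and yields "insert", while the same call made after parsing has finished yields "open, then insert".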
The document.write(...) method steps are to run the document write steps with this and a string that is the concatenation of all arguments passed.
document.writeln()
document.writeln(...text)
Adds the given string(s) to the Document's input stream, followed by a newline character. If necessary, calls the open() method implicitly first.
Throws an "InvalidStateError" DOMException when invoked on XML documents.
Throws an "InvalidStateError" DOMException if the parser is currently executing a custom element constructor.
The document.writeln(...) method steps are to run the document write steps with this and a string that is the concatenation of all arguments passed and U+000A LINE FEED.
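The only difference between the two methods is the string handed to the document write steps. As a minimal sketch, the hypothetical functions below compute that string. String() is used here as an approximation of the Web IDL conversion to DOMString, which is an assumption of this example (the two differ for some values, such as symbols).

```javascript
// Hypothetical helpers: the input string that write() and writeln()
// would pass to the document write steps for the given arguments.
function writeInput(...text) {
  return text.map(String).join("");
}

function writelnInput(...text) {
  return writeInput(...text) + "\n"; // U+000A LINE FEED
}
```

So writelnInput("<p>", "x", "</p>") produces the same input as writeInput("<p>", "x", "</p>", "\n"), and writelnInput() with no arguments produces a lone line feed.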
DOMParser interface
The DOMParser interface allows authors to create new Document objects by parsing strings, as either HTML or XML.
parser = new DOMParser()
Constructs a new DOMParser object.
document = parser.parseFromString(string, type)
Parses string using either the HTML or XML parser, according to type, and returns the resulting Document. type can be "text/html" (which will invoke the HTML parser), or any of "text/xml", "application/xml", "application/xhtml+xml", or "image/svg+xml" (which will invoke the XML parser).
For the XML parser, if string cannot be parsed, then the returned Document will contain elements describing the resulting error.
Note that script elements are not evaluated during parsing, and the resulting document's encoding will always be UTF-8. The document's URL will be inherited from parser's relevant global object.
Values other than the above for type will cause a TypeError exception to be thrown.
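Because XML parse failures are reported inside the returned Document rather than by throwing, callers usually have to inspect the result. The following sketch shows one way to do that. The helper name parseXMLOrThrow is hypothetical, and matching the error element by tag name alone (ignoring its namespace) is an assumption made for simplicity, so a well-formed document that legitimately contains an element named parsererror would be misreported.

```javascript
// Hypothetical helper: parses `source` as XML using a DOMParser-like
// object and throws if the result contains a parsererror element.
function parseXMLOrThrow(parser, source) {
  const doc = parser.parseFromString(source, "application/xml");
  // Assumption: any element named "parsererror" signals a failed parse.
  const errors = doc.getElementsByTagName("parsererror");
  if (errors.length > 0) {
    throw new SyntaxError(errors[0].textContent || "XML parsing failed");
  }
  return doc;
}
```

In a page this would be called as parseXMLOrThrow(new DOMParser(), "<a><b/></a>"). Taking the parser as a parameter keeps the helper independent of any global.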
The design of DOMParser, as a class that needs to be constructed and then have its parseFromString() method called, is an unfortunate historical artifact. If we were designing this functionality today it would be a standalone function. For parsing HTML, the modern alternative is Document.parseHTMLUnsafe().
[Exposed=Window]
interface DOMParser {
  constructor();
  [NewObject] Document parseFromString(DOMString string, DOMParserSupportedType type);
};
enum DOMParserSupportedType {
  "text/html",
  "text/xml",
  "application/xml",
  "application/xhtml+xml",
  "image/svg+xml"
};
The new DOMParser() constructor steps are to do nothing.
The parseFromString(string, type) method steps are:
Let document be a new Document, whose content type is type and URL is this's relevant global object's associated Document's URL.
The document's encoding will be left as its default, of UTF-8. In particular, any XML declarations or meta elements found while parsing string will have no effect.
Switch on type:
"text/html"
Parse HTML from a string given document and string.
Since document does not have a browsing context, scripting is disabled.
Otherwise
Create an XML parser parser, associated with document, and with XML scripting support disabled.
Parse string using parser.
If the previous step resulted in an XML well-formedness or XML namespace well-formedness error, then:
Assert: document has no child nodes.
Let root be the result of creating an element given document, "parsererror", and "http://www.mozilla.org/newlayout/xml/parsererror.xml".
Optionally, add attributes or children to root to describe the nature of the parsing error.
Append root to document.
Return document.
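The choice of parser in these steps depends only on type, and values outside the DOMParserSupportedType enumeration are rejected with a TypeError by the Web IDL enumeration conversion before the steps run. A minimal sketch of that mapping follows. The function name parserKindFor and its return values "html" and "xml" are invented for this example.

```javascript
// The five values of the DOMParserSupportedType enumeration.
const SUPPORTED_TYPES = [
  "text/html",
  "text/xml",
  "application/xml",
  "application/xhtml+xml",
  "image/svg+xml"
];

// Hypothetical helper: which parser parseFromString() would use.
function parserKindFor(type) {
  if (!SUPPORTED_TYPES.includes(type)) {
    throw new TypeError("Unsupported type: " + String(type));
  }
  return type === "text/html" ? "html" : "xml";
}
```

Note that "application/xhtml+xml" and "image/svg+xml" both select the XML parser, so an XHTML or SVG string that is not well-formed yields a parsererror document rather than a best-effort tree.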
To parse HTML from a string, given a Document document and a string html:
Set document's type to "html".
Create an HTML parser parser, associated with document.
Place html into the input stream for parser. The encoding confidence is irrelevant.
Start parser and let it run until it has consumed all the characters just inserted into the input stream.
This might mutate the document's mode.
element.setHTMLUnsafe(html)
Parses html using the HTML parser, and replaces the children of element with the result. element provides context for the HTML parser.
shadowRoot.setHTMLUnsafe(html)
Parses html using the HTML parser, and replaces the children of shadowRoot with the result. shadowRoot's host provides context for the HTML parser.
doc = Document.parseHTMLUnsafe(html)
Parses html using the HTML parser, and returns the resulting Document.
Note that script elements are not evaluated during parsing, and the resulting document's encoding will always be UTF-8. The document's URL will be about:blank.
These methods perform no sanitization to remove potentially-dangerous elements and attributes like script or event handler content attributes.
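Since setHTMLUnsafe() may be missing in older engines, code that uses it often feature-detects first. The sketch below is one possible shape for that, with a hypothetical helper name (setMarkup). The fallback to the innerHTML setter is an assumption of this example, not something this section prescribes, and the two are not equivalent: the fallback does not get the declarative shadow root handling that setHTMLUnsafe() enables. Neither path sanitizes its input.

```javascript
// Hypothetical helper: sets the children of `target` (an Element or
// ShadowRoot) from `html`, and reports which mechanism was used.
// WARNING: neither path sanitizes `html`; only pass trusted markup.
function setMarkup(target, html) {
  if (typeof target.setHTMLUnsafe === "function") {
    target.setHTMLUnsafe(html);
    return "setHTMLUnsafe";
  }
  // Assumed fallback for engines without setHTMLUnsafe().
  target.innerHTML = html;
  return "innerHTML";
}
```

Returning the name of the path taken lets callers log or test which behavior they actually got.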
partial interface Element {
  [CEReactions] undefined setHTMLUnsafe(DOMString html);
};
partial interface ShadowRoot {
  [CEReactions] undefined setHTMLUnsafe(DOMString html);
};
Element's setHTMLUnsafe(html) method steps are:
Let target be this's template contents if this is a template element; otherwise this.
Unsafely set HTML given target, this, and html.
ShadowRoot's setHTMLUnsafe(html) method steps are to unsafely set HTML given this, this's shadow host, and html.
To unsafely set HTML, given an Element or DocumentFragment target, an Element contextElement, and a string html:
Let newChildren be the result of the HTML fragment parsing algorithm given contextElement, html, and true.
Let fragment be a new DocumentFragment whose node document is contextElement's node document.
For each node in newChildren, append node to fragment.
Replace all with fragment within target.
The static parseHTMLUnsafe(html) method steps are:
Let document be a new Document, whose content type is "text/html".
Since document does not have a browsing context, scripting is disabled.
Set document's allow declarative shadow roots to true.
Parse HTML from a string given document and html.
Return document.