org.fife.ui.rsyntaxtextarea.modes
Class HTMLTokenMaker

java.lang.Object
  extended by org.fife.ui.rsyntaxtextarea.TokenMakerBase
      extended by org.fife.ui.rsyntaxtextarea.AbstractJFlexTokenMaker
          extended by org.fife.ui.rsyntaxtextarea.modes.AbstractMarkupTokenMaker
              extended by org.fife.ui.rsyntaxtextarea.modes.HTMLTokenMaker
All Implemented Interfaces:
TokenMaker

public class HTMLTokenMaker
extends AbstractMarkupTokenMaker

Scanner for HTML 5 files. This implementation was created using JFlex 1.4.1; however, the generated file was modified for performance. Memory allocation needs to be almost completely removed to be competitive with the handwritten lexers (subclasses of AbstractTokenMaker), so this class has been modified so that Strings are never allocated (via yytext()) and the scanner never has to refill its buffer (needlessly copying chars around). This is possible because RText always scans exactly one line of tokens at a time and hands the scanner that line as an array of characters (a Segment, really). Since tokens contain pointers to char arrays instead of Strings holding their contents, no new memory needs to be allocated for Strings.

The actual algorithm generated for scanning has, of course, not been modified.
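
The allocation-free design described above can be illustrated with a minimal sketch. It is not the library's implementation: SpanToken and SpanScannerSketch are hypothetical names, and the whitespace-based splitting merely stands in for the real JFlex rules. Only javax.swing.text.Segment is a real JDK class. The point is that each token records indices into the line's char array instead of holding a String copy (the small token objects themselves are still allocated here).

```java
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.Segment;

/** Hypothetical token: a view onto a shared line buffer (not the library's Token class). */
final class SpanToken {
    final char[] text; // the line buffer itself, shared by every token on the line
    final int start;   // index of the token's first char in text
    final int end;     // index of the token's last char in text (inclusive)

    SpanToken(char[] text, int start, int end) {
        this.text = text;
        this.start = start;
        this.end = end;
    }

    int length() {
        return end - start + 1;
    }

    char charAt(int i) {
        return text[start + i];
    }
}

final class SpanScannerSketch {

    /** Splits one line into alternating whitespace and non-whitespace runs without creating Strings. */
    static List<SpanToken> scan(Segment line) {
        List<SpanToken> tokens = new ArrayList<>();
        char[] buf = line.array;
        int limit = line.offset + line.count;
        int i = line.offset;
        while (i < limit) {
            int runStart = i;
            boolean whitespace = Character.isWhitespace(buf[i]);
            while (i < limit && Character.isWhitespace(buf[i]) == whitespace) {
                i++;
            }
            tokens.add(new SpanToken(buf, runStart, i - 1));
        }
        return tokens;
    }

    public static void main(String[] args) {
        char[] chars = "<b> hi".toCharArray();
        for (SpanToken t : scan(new Segment(chars, 0, chars.length))) {
            System.out.println(t.start + ".." + t.end + " (" + t.length() + " chars)");
        }
    }
}
```

Because every SpanToken refers to the same array, the tokens are only valid while that line buffer is left unchanged.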

If you wish to regenerate this file yourself, keep in mind the following:


Field Summary
static int COMMENT
           
static int CSS
           
static int CSS_C_STYLE_COMMENT
           
static int CSS_CHAR_LITERAL
           
static int CSS_PROPERTY
           
static int CSS_STRING
           
static int CSS_VALUE
           
static int DTD
           
static int INATTR_DOUBLE
           
static int INATTR_DOUBLE_SCRIPT
           
static int INATTR_DOUBLE_STYLE
           
static int INATTR_SINGLE
           
static int INATTR_SINGLE_SCRIPT
          lexical states
static int INATTR_SINGLE_STYLE
           
static int INTAG
           
static int INTAG_CHECK_TAG_NAME
           
static int INTAG_SCRIPT
           
static int INTAG_STYLE
           
static int INTERNAL_ATTR_DOUBLE
          Type specific to XMLTokenMaker denoting a line ending with an unclosed double-quote attribute.
static int INTERNAL_ATTR_DOUBLE_QUOTE_SCRIPT
          Token type specifying we're in a double-quoted attribute in a script tag.
static int INTERNAL_ATTR_DOUBLE_QUOTE_STYLE
          Token type specifying we're in a double-quoted attribute in a style tag.
static int INTERNAL_ATTR_SINGLE
          Type specific to XMLTokenMaker denoting a line ending with an unclosed single-quote attribute.
static int INTERNAL_ATTR_SINGLE_QUOTE_SCRIPT
          Token type specifying we're in a single-quoted attribute in a script tag.
static int INTERNAL_ATTR_SINGLE_QUOTE_STYLE
          Token type specifying we're in a single-quoted attribute in a style tag.
static int INTERNAL_CSS
          Internal type denoting a line ending in CSS.
static int INTERNAL_CSS_CHAR
          Internal type denoting a line ending in a CSS single-quote string.
static int INTERNAL_CSS_MLC
          Internal type denoting a line ending in a CSS multi-line comment.
static int INTERNAL_CSS_PROPERTY
          Internal type denoting a line ending in a CSS property.
static int INTERNAL_CSS_STRING
          Internal type denoting a line ending in a CSS double-quote string.
static int INTERNAL_CSS_VALUE
          Internal type denoting a line ending in a CSS property value.
static int INTERNAL_IN_JS
          Token type specifying we're in JavaScript.
static int INTERNAL_IN_JS_CHAR_INVALID
          Token type specifying we're in an invalid multi-line JS single-quoted string.
static int INTERNAL_IN_JS_CHAR_VALID
          Token type specifying we're in a valid multi-line JS single-quoted string.
static int INTERNAL_IN_JS_MLC
          Token type specifying we're in a JavaScript multiline comment.
static int INTERNAL_IN_JS_STRING_INVALID
          Token type specifying we're in an invalid multi-line JS string.
static int INTERNAL_IN_JS_STRING_VALID
          Token type specifying we're in a valid multi-line JS string.
static int INTERNAL_INTAG
          Token type specific to HTMLTokenMaker; this signals that the user has ended a line with an unclosed HTML tag; thus a new line is beginning still inside of the tag.
static int INTERNAL_INTAG_SCRIPT
          Token type specific to HTMLTokenMaker; this signals that the user has ended a line with an unclosed <script> tag.
static int INTERNAL_INTAG_STYLE
          Token type specific to HTMLTokenMaker; this signals that the user has ended a line with an unclosed <style> tag.
static int JAVASCRIPT
           
static int JS_CHAR
           
static int JS_EOL_COMMENT
           
static int JS_MLC
           
static int JS_STRING
           
static int PI
           
static int YYEOF
          This character denotes the end of file.
static int YYINITIAL
           
 
Fields inherited from class org.fife.ui.rsyntaxtextarea.AbstractJFlexTokenMaker
offsetShift, s, start
 
Fields inherited from class org.fife.ui.rsyntaxtextarea.TokenMakerBase
currentToken, firstToken, previousToken
 
Constructor Summary
HTMLTokenMaker()
          Constructor.
HTMLTokenMaker(java.io.InputStream in)
          Creates a new scanner.
HTMLTokenMaker(java.io.Reader in)
          Creates a new scanner. There is also a java.io.InputStream version of this constructor.
 
Method Summary
 void addToken(char[] array, int start, int end, int tokenType, int startOffset)
          Adds the token specified to the current linked list of tokens.
protected  OccurrenceMarker createOccurrenceMarker()
          Returns the occurrence marker to use for this token maker.
 boolean getCompleteCloseTags()
          Returns whether markup close tags should be completed.
 boolean getCurlyBracesDenoteCodeBlocks(int languageIndex)
          Returns whether this programming language uses curly braces ('{' and '}') to denote code blocks.
 java.lang.String[] getLineCommentStartAndEnd(int languageIndex)
          Returns the text to place at the beginning and end of a line to "comment" it in this programming language.
 boolean getMarkOccurrencesOfTokenType(int type)
          Returns whether tokens of the given type should have "mark occurrences" enabled, which is the case for Token.MARKUP_TAG_NAME.
 boolean getShouldIndentNextLineAfter(Token token)
          Overridden to handle newlines in JS and CSS differently than those in markup.
 Token getTokenList(javax.swing.text.Segment text, int initialTokenType, int startOffset)
          Returns the first token in the linked list of tokens generated from text.
static void setCompleteCloseTags(boolean complete)
          Sets whether markup close tags should be completed.
 void yybegin(int newState)
          Enters a new lexical state.
 char yycharat(int pos)
          Returns the character at position pos from the matched text.
 void yyclose()
          Closes the input stream.
 int yylength()
          Returns the length of the matched text region.
 Token yylex()
          Resumes scanning until the next regular expression is matched, the end of input is encountered or an I/O-Error occurs.
 void yypushback(int number)
          Pushes the specified amount of characters back into the input stream.
 void yyreset(java.io.Reader reader)
          Resets the scanner to read from a new input stream.
 int yystate()
          Returns the current lexical state.
 java.lang.String yytext()
          Returns the text matched by the current regular expression.
 
Methods inherited from class org.fife.ui.rsyntaxtextarea.modes.AbstractMarkupTokenMaker
isMarkupLanguage
 
Methods inherited from class org.fife.ui.rsyntaxtextarea.AbstractJFlexTokenMaker
yybegin
 
Methods inherited from class org.fife.ui.rsyntaxtextarea.TokenMakerBase
addNullToken, addToken, addToken, getClosestStandardTokenTypeForInternalType, getInsertBreakAction, getLanguageIndex, getLastTokenTypeOnLine, getOccurrenceMarker, isIdentifierChar, resetTokenList, setLanguageIndex
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

YYEOF

public static final int YYEOF
This character denotes the end of file.

See Also:
Constant Field Values

INATTR_SINGLE_SCRIPT

public static final int INATTR_SINGLE_SCRIPT
lexical states

See Also:
Constant Field Values

JS_CHAR

public static final int JS_CHAR
See Also:
Constant Field Values

CSS_STRING

public static final int CSS_STRING
See Also:
Constant Field Values

JS_MLC

public static final int JS_MLC
See Also:
Constant Field Values

CSS_CHAR_LITERAL

public static final int CSS_CHAR_LITERAL
See Also:
Constant Field Values

INTAG_SCRIPT

public static final int INTAG_SCRIPT
See Also:
Constant Field Values

CSS_PROPERTY

public static final int CSS_PROPERTY
See Also:
Constant Field Values

CSS_C_STYLE_COMMENT

public static final int CSS_C_STYLE_COMMENT
See Also:
Constant Field Values

CSS

public static final int CSS
See Also:
Constant Field Values

CSS_VALUE

public static final int CSS_VALUE
See Also:
Constant Field Values

COMMENT

public static final int COMMENT
See Also:
Constant Field Values

INATTR_DOUBLE_SCRIPT

public static final int INATTR_DOUBLE_SCRIPT
See Also:
Constant Field Values

PI

public static final int PI
See Also:
Constant Field Values

JAVASCRIPT

public static final int JAVASCRIPT
See Also:
Constant Field Values

INTAG

public static final int INTAG
See Also:
Constant Field Values

INTAG_CHECK_TAG_NAME

public static final int INTAG_CHECK_TAG_NAME
See Also:
Constant Field Values

INATTR_SINGLE_STYLE

public static final int INATTR_SINGLE_STYLE
See Also:
Constant Field Values

DTD

public static final int DTD
See Also:
Constant Field Values

JS_EOL_COMMENT

public static final int JS_EOL_COMMENT
See Also:
Constant Field Values

INATTR_DOUBLE_STYLE

public static final int INATTR_DOUBLE_STYLE
See Also:
Constant Field Values

INATTR_SINGLE

public static final int INATTR_SINGLE
See Also:
Constant Field Values

YYINITIAL

public static final int YYINITIAL
See Also:
Constant Field Values

INATTR_DOUBLE

public static final int INATTR_DOUBLE
See Also:
Constant Field Values

JS_STRING

public static final int JS_STRING
See Also:
Constant Field Values

INTAG_STYLE

public static final int INTAG_STYLE
See Also:
Constant Field Values

INTERNAL_ATTR_DOUBLE

public static final int INTERNAL_ATTR_DOUBLE
Type specific to XMLTokenMaker denoting a line ending with an unclosed double-quote attribute.

See Also:
Constant Field Values

INTERNAL_ATTR_SINGLE

public static final int INTERNAL_ATTR_SINGLE
Type specific to XMLTokenMaker denoting a line ending with an unclosed single-quote attribute.

See Also:
Constant Field Values

INTERNAL_INTAG

public static final int INTERNAL_INTAG
Token type specific to HTMLTokenMaker; this signals that the user has ended a line with an unclosed HTML tag; thus a new line is beginning still inside of the tag.

See Also:
Constant Field Values

INTERNAL_INTAG_SCRIPT

public static final int INTERNAL_INTAG_SCRIPT
Token type specific to HTMLTokenMaker; this signals that the user has ended a line with an unclosed <script> tag.

See Also:
Constant Field Values

INTERNAL_ATTR_DOUBLE_QUOTE_SCRIPT

public static final int INTERNAL_ATTR_DOUBLE_QUOTE_SCRIPT
Token type specifying we're in a double-quoted attribute in a script tag.

See Also:
Constant Field Values

INTERNAL_ATTR_SINGLE_QUOTE_SCRIPT

public static final int INTERNAL_ATTR_SINGLE_QUOTE_SCRIPT
Token type specifying we're in a single-quoted attribute in a script tag.

See Also:
Constant Field Values

INTERNAL_INTAG_STYLE

public static final int INTERNAL_INTAG_STYLE
Token type specific to HTMLTokenMaker; this signals that the user has ended a line with an unclosed <style> tag.

See Also:
Constant Field Values

INTERNAL_ATTR_DOUBLE_QUOTE_STYLE

public static final int INTERNAL_ATTR_DOUBLE_QUOTE_STYLE
Token type specifying we're in a double-quoted attribute in a style tag.

See Also:
Constant Field Values

INTERNAL_ATTR_SINGLE_QUOTE_STYLE

public static final int INTERNAL_ATTR_SINGLE_QUOTE_STYLE
Token type specifying we're in a single-quoted attribute in a style tag.

See Also:
Constant Field Values

INTERNAL_IN_JS

public static final int INTERNAL_IN_JS
Token type specifying we're in JavaScript.

See Also:
Constant Field Values

INTERNAL_IN_JS_MLC

public static final int INTERNAL_IN_JS_MLC
Token type specifying we're in a JavaScript multiline comment.

See Also:
Constant Field Values

INTERNAL_IN_JS_STRING_INVALID

public static final int INTERNAL_IN_JS_STRING_INVALID
Token type specifying we're in an invalid multi-line JS string.

See Also:
Constant Field Values

INTERNAL_IN_JS_STRING_VALID

public static final int INTERNAL_IN_JS_STRING_VALID
Token type specifying we're in a valid multi-line JS string.

See Also:
Constant Field Values

INTERNAL_IN_JS_CHAR_INVALID

public static final int INTERNAL_IN_JS_CHAR_INVALID
Token type specifying we're in an invalid multi-line JS single-quoted string.

See Also:
Constant Field Values

INTERNAL_IN_JS_CHAR_VALID

public static final int INTERNAL_IN_JS_CHAR_VALID
Token type specifying we're in a valid multi-line JS single-quoted string.

See Also:
Constant Field Values

INTERNAL_CSS

public static final int INTERNAL_CSS
Internal type denoting a line ending in CSS.

See Also:
Constant Field Values

INTERNAL_CSS_PROPERTY

public static final int INTERNAL_CSS_PROPERTY
Internal type denoting a line ending in a CSS property.

See Also:
Constant Field Values

INTERNAL_CSS_VALUE

public static final int INTERNAL_CSS_VALUE
Internal type denoting a line ending in a CSS property value.

See Also:
Constant Field Values

INTERNAL_CSS_STRING

public static final int INTERNAL_CSS_STRING
Internal type denoting a line ending in a CSS double-quote string. The state to return to is embedded in the actual end token type.

See Also:
Constant Field Values

INTERNAL_CSS_CHAR

public static final int INTERNAL_CSS_CHAR
Internal type denoting a line ending in a CSS single-quote string. The state to return to is embedded in the actual end token type.

See Also:
Constant Field Values

INTERNAL_CSS_MLC

public static final int INTERNAL_CSS_MLC
Internal type denoting a line ending in a CSS multi-line comment. The state to return to is embedded in the actual end token type.

See Also:
Constant Field Values
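
The INTERNAL_* types above exist because each line is scanned on its own, so the scanner has to be told in which state the previous line ended. The following minimal sketch shows that idea for HTML comments only. The class name, the two state constants, and their values are hypothetical and are not the values used by HTMLTokenMaker.

```java
final class LineStateSketch {

    // Hypothetical states; the real class defines many more and uses its own values.
    static final int STATE_TEXT = 0;
    static final int STATE_IN_COMMENT = 1;

    /** Scans one line starting in initialState and returns the state the next line must start in. */
    static int scanLine(String line, int initialState) {
        int state = initialState;
        int i = 0;
        while (i < line.length()) {
            if (state == STATE_TEXT && line.startsWith("<!--", i)) {
                state = STATE_IN_COMMENT; // comment opened on this line
                i += 4;
            } else if (state == STATE_IN_COMMENT && line.startsWith("-->", i)) {
                state = STATE_TEXT;       // comment closed on this line
                i += 3;
            } else {
                i++;
            }
        }
        return state;
    }

    public static void main(String[] args) {
        String[] lines = { "<p>a <!-- start", "still inside", "end --> <b>" };
        int state = STATE_TEXT;
        for (String line : lines) {
            state = scanLine(line, state);
            System.out.println(line + " -> ends in state " + state);
        }
    }
}
```

In the same spirit, the end-of-line state of one line would be passed as the initial token type when the next line is scanned.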
Constructor Detail

HTMLTokenMaker

public HTMLTokenMaker()
Constructor. This must be here because JFlex does not generate a no-parameter constructor.


HTMLTokenMaker

public HTMLTokenMaker(java.io.Reader in)
Creates a new scanner. There is also a java.io.InputStream version of this constructor.

Parameters:
in - the java.io.Reader to read input from.

HTMLTokenMaker

public HTMLTokenMaker(java.io.InputStream in)
Creates a new scanner. There is also a java.io.Reader version of this constructor.

Parameters:
in - the java.io.InputStream to read input from.
Method Detail

addToken

public void addToken(char[] array,
                     int start,
                     int end,
                     int tokenType,
                     int startOffset)
Adds the token specified to the current linked list of tokens.

Specified by:
addToken in interface TokenMaker
Overrides:
addToken in class TokenMakerBase
Parameters:
array - The character array.
start - The starting offset in the array.
end - The ending offset in the array.
tokenType - The token's type.
startOffset - The offset in the document at which this token occurs.
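
To make the relationship between the array offsets and the document offset concrete, here is a minimal sketch of appending tokens to a linked list. TokenListSketch and its Node class are hypothetical and do not mirror the library's Token type. The sketch also assumes that end is an inclusive index, which the parameter description above does not state.

```java
final class TokenListSketch {

    /** Hypothetical list node; field names are illustrative only. */
    static final class Node {
        final char[] array;
        final int start;     // first index of the token in array
        final int end;       // last index of the token in array (assumed inclusive)
        final int type;
        final int docOffset; // document offset corresponding to array[start]
        Node next;

        Node(char[] array, int start, int end, int type, int docOffset) {
            this.array = array;
            this.start = start;
            this.end = end;
            this.type = type;
            this.docOffset = docOffset;
        }
    }

    private Node first;
    private Node last;

    /** Appends a token to the end of the list, mirroring the parameter order documented above. */
    void addToken(char[] array, int start, int end, int tokenType, int startOffset) {
        Node node = new Node(array, start, end, tokenType, startOffset);
        if (first == null) {
            first = node;
        } else {
            last.next = node;
        }
        last = node;
    }

    Node firstToken() {
        return first;
    }

    /** Maps an index in the line's char array to an offset in the document. */
    static int documentOffset(Node node, int arrayIndex) {
        return node.docOffset + (arrayIndex - node.start);
    }

    public static void main(String[] args) {
        char[] line = "<b>hi".toCharArray();
        TokenListSketch list = new TokenListSketch();
        list.addToken(line, 0, 2, 1, 100); // "<b>" begins at document offset 100
        list.addToken(line, 3, 4, 2, 103); // "hi" begins at document offset 103
        for (Node n = list.firstToken(); n != null; n = n.next) {
            System.out.println("type " + n.type + " at document offset " + n.docOffset);
        }
    }
}
```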

createOccurrenceMarker

protected OccurrenceMarker createOccurrenceMarker()
Returns the occurrence marker to use for this token maker. Subclasses can override to use different implementations.

Overrides:
createOccurrenceMarker in class TokenMakerBase
Returns:
The occurrence marker to use.

getCompleteCloseTags

public boolean getCompleteCloseTags()
Returns whether markup close tags should be completed. You might not want this to be the case, since some tags in standard HTML aren't usually closed.

Specified by:
getCompleteCloseTags in class AbstractMarkupTokenMaker
Returns:
Whether closing markup tags are completed.
See Also:
setCompleteCloseTags(boolean)

getCurlyBracesDenoteCodeBlocks

public boolean getCurlyBracesDenoteCodeBlocks(int languageIndex)
Description copied from class: TokenMakerBase
Returns whether this programming language uses curly braces ('{' and '}') to denote code blocks. The default implementation returns false; subclasses can override this method if necessary.

Specified by:
getCurlyBracesDenoteCodeBlocks in interface TokenMaker
Overrides:
getCurlyBracesDenoteCodeBlocks in class TokenMakerBase
Parameters:
languageIndex - The language index at the offset in question. Since some TokenMakers effectively have nested languages (such as JavaScript in HTML), this parameter tells the TokenMaker what sub-language to look at.
Returns:
Whether curly braces denote code blocks.

getLineCommentStartAndEnd

public java.lang.String[] getLineCommentStartAndEnd(int languageIndex)
Returns the text to place at the beginning and end of a line to "comment" it in this programming language.

Specified by:
getLineCommentStartAndEnd in interface TokenMaker
Overrides:
getLineCommentStartAndEnd in class AbstractMarkupTokenMaker
Parameters:
languageIndex - The language index at the offset in question. Since some TokenMakers effectively have nested languages (such as JavaScript in HTML), this parameter tells the TokenMaker what sub-language to look at.
Returns:
The start and end strings to add to a line to "comment" it out. A null value for either means there is no string to add for that part. A value of null for the array means this language does not support commenting/uncommenting lines.
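
The return-value contract described above (a two-element array whose entries may be null, or a null array) can be sketched as follows. The language index constants and the delimiters chosen for each sub-language are assumptions made for illustration; they are not taken from this page.

```java
final class CommentDelimiterSketch {

    // Hypothetical language indices for markup, embedded JavaScript, and embedded CSS.
    static final int LANG_MARKUP = 0;
    static final int LANG_JS = 1;
    static final int LANG_CSS = 2;

    /** Returns {start, end}; a null entry means nothing is added on that side. */
    static String[] lineCommentStartAndEnd(int languageIndex) {
        switch (languageIndex) {
            case LANG_JS:
                return new String[] { "//", null };
            case LANG_CSS:
                return new String[] { "/*", "*/" };
            default:
                return new String[] { "<!--", "-->" };
        }
    }

    /** Applies the delimiters to a line; a null array means commenting is unsupported. */
    static String commentOut(String line, String[] startAndEnd) {
        if (startAndEnd == null) {
            return line;
        }
        StringBuilder sb = new StringBuilder();
        if (startAndEnd[0] != null) {
            sb.append(startAndEnd[0]);
        }
        sb.append(line);
        if (startAndEnd[1] != null) {
            sb.append(startAndEnd[1]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(commentOut("x = 1;", lineCommentStartAndEnd(LANG_JS)));
        System.out.println(commentOut("a { color: red }", lineCommentStartAndEnd(LANG_CSS)));
        System.out.println(commentOut("<p>text</p>", lineCommentStartAndEnd(LANG_MARKUP)));
    }
}
```

A caller has to check both the array and each element for null before using them, as commentOut does.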

getMarkOccurrencesOfTokenType

public boolean getMarkOccurrencesOfTokenType(int type)
Returns whether tokens of the given type should have "mark occurrences" enabled, which is the case for Token.MARKUP_TAG_NAME.

Specified by:
getMarkOccurrencesOfTokenType in interface TokenMaker
Overrides:
getMarkOccurrencesOfTokenType in class TokenMakerBase
Parameters:
type - The token type.
Returns:
Whether tokens of this type should have "mark occurrences" enabled.

getShouldIndentNextLineAfter

public boolean getShouldIndentNextLineAfter(Token token)
Overridden to handle newlines in JS and CSS differently than those in markup.

Specified by:
getShouldIndentNextLineAfter in interface TokenMaker
Overrides:
getShouldIndentNextLineAfter in class TokenMakerBase
Parameters:
token - The token the previous line ends with.
Returns:
Whether the next line should be indented.

getTokenList

public Token getTokenList(javax.swing.text.Segment text,
                          int initialTokenType,
                          int startOffset)
Returns the first token in the linked list of tokens generated from text. This method must be implemented by subclasses so they can correctly implement syntax highlighting.

Parameters:
text - The text from which to get tokens.
initialTokenType - The token type we should start with.
startOffset - The offset into the document at which text starts.
Returns:
The first Token in a linked list representing the syntax highlighted text.

setCompleteCloseTags

public static void setCompleteCloseTags(boolean complete)
Sets whether markup close tags should be completed. You might not want this to be the case, since some tags in standard HTML aren't usually closed.

Parameters:
complete - Whether closing markup tags are completed.
See Also:
getCompleteCloseTags()

yyreset

public final void yyreset(java.io.Reader reader)
Resets the scanner to read from a new input stream. Does not close the old reader. All internal variables are reset, and the old input stream cannot be reused (the internal buffer is discarded and lost). The lexical state is set to YYINITIAL.

Parameters:
reader - the new input stream

yyclose

public final void yyclose()
                   throws java.io.IOException
Closes the input stream.

Throws:
java.io.IOException

yystate

public final int yystate()
Returns the current lexical state.


yybegin

public final void yybegin(int newState)
Enters a new lexical state.

Specified by:
yybegin in class AbstractJFlexTokenMaker
Parameters:
newState - the new lexical state

yytext

public final java.lang.String yytext()
Returns the text matched by the current regular expression.


yycharat

public final char yycharat(int pos)
Returns the character at position pos from the matched text. It is equivalent to yytext().charAt(pos), but faster.

Parameters:
pos - the position of the character to fetch. A value from 0 to yylength()-1.
Returns:
the character at position pos

yylength

public final int yylength()
Returns the length of the matched text region.


yypushback

public void yypushback(int number)
Pushes the specified number of characters back into the input stream. They will be read again by the next call of the scanning method.

Parameters:
number - the number of characters to be read again. This number must not be greater than yylength()!
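
The interplay of yytext(), yycharat(), yylength() and yypushback() can be pictured as a window over the input buffer. The sketch below is a simplified model under that assumption, not the generated scanner: MatchWindowSketch, matchNext and the field names are hypothetical, and matchNext simply pretends that some rule matched the next len characters.

```java
final class MatchWindowSketch {

    private final char[] buffer;
    private int startRead; // index of the first char of the current match
    private int markedPos; // index one past the last char of the current match

    MatchWindowSketch(char[] buffer) {
        this.buffer = buffer;
    }

    /** Hypothetical helper: pretends a rule matched the next len chars of the buffer. */
    void matchNext(int len) {
        if (len < 0 || markedPos + len > buffer.length) {
            throw new IllegalArgumentException("match runs past the buffer");
        }
        startRead = markedPos;
        markedPos += len;
    }

    int yylength() {
        return markedPos - startRead;
    }

    /** Reads directly from the buffer, so no String is created. */
    char yycharat(int pos) {
        return buffer[startRead + pos];
    }

    /** Allocates a new String, which is why a performance-sensitive scanner avoids calling it. */
    String yytext() {
        return new String(buffer, startRead, yylength());
    }

    /** Shrinks the current match so the last number chars are scanned again. */
    void yypushback(int number) {
        if (number < 0 || number > yylength()) {
            throw new IllegalArgumentException("cannot push back more than yylength()");
        }
        markedPos -= number;
    }

    public static void main(String[] args) {
        MatchWindowSketch m = new MatchWindowSketch("</script>".toCharArray());
        m.matchNext(9);  // the whole close tag is matched
        m.yypushback(7); // keep only "</", rescan the rest
        System.out.println(m.yytext());
        m.matchNext(6);  // the pushed-back chars are read again
        System.out.println(m.yytext());
    }
}
```

Pushing back more characters than the current match contains is rejected, which corresponds to the restriction stated for the number parameter.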

yylex

public Token yylex()
            throws java.io.IOException
Resumes scanning until the next regular expression is matched, the end of input is encountered or an I/O-Error occurs.

Returns:
the next token
Throws:
java.io.IOException - if any I/O-Error occurs