Thursday, 29 November 2012

WebVTT Syntax Rules Do Not Apply To Parser

Let's see what the World Wide Web Consortium (W3C) documentation and community say. I'll focus on WebVTT as much as possible. I'm sure I could find more quotes, and I've been told the same thing twice on IRC.

Quotes

When writing something that reads WebVTT files, be very sure to parse it as specified by the parser--*not* by reading the syntax and coming up with your own parsing algorithm.

Glenn Maynard
http://lists.w3.org/Archives/Public/public-texttracks/2012Jul/0011.html


[Syntax rules] are requirements for writing, not for parsing. Requirements in that section don't apply to you.

Simon Pieters
http://lists.w3.org/Archives/Public/public-texttracks/2012Nov/0017.html


"It's a bit unusual for a standard to specify the parsing algorithm, but I can understand why."
It's not unusual for modern specs.  It's a much more dependable way of getting to consistent behavior than only specifying a format.

Glenn Maynard
http://lists.w3.org/Archives/Public/public-texttracks/2012Jul/0014.html


However, when we go to the parsing section of the spec, there is no step in the parser that makes sure that cues that are out of time order are ignored. This means that an implemented parser will pick up such cues and enter them into the list of cues to be used at the time that they are relevant.

(Note: This implies that implementations are independent of the syntax requirements.)

Silvia Pfeiffer
https://www.w3.org/Bugs/Public/show_bug.cgi?id=15632


Conformance requirements phrased as algorithms or specific steps may be implemented in any manner, so long as the end result is equivalent.

http://dev.w3.org/html5/spec//infrastructure.html#conformance-requirements

Wednesday, 28 November 2012

WebVTT Unit Test Review

The C++ wrapper did not have a function to check if the cue setting line was set to "auto". I added it.
isLinePositionAuto()

I changed almost all the comments for the unit tests. They referenced the syntax rules of the specification when they should reference the parser rules.

If a malformed setting is encountered, it is ignored. If a setting is used more than once, the rightmost setting (last in the token list) is used.
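To make that rule concrete, here is a minimal C++ sketch. The `CueSettings` struct and this `parseSettings` function are hypothetical stand-ins I made up for illustration, not the project's actual types or API, and only the vertical and align settings are handled.

```cpp
#include <sstream>
#include <string>

// Hypothetical, simplified cue model; not the project's actual types.
struct CueSettings {
    std::string vertical;          // empty means horizontal, the default
    std::string align = "middle";  // default alignment
};

// Walk the settings left to right. Well-formed settings overwrite earlier
// ones; malformed settings are skipped without raising an error.
CueSettings parseSettings(const std::string& input) {
    CueSettings cue;
    std::istringstream tokens(input);
    std::string setting;
    while (tokens >> setting) {  // operator>> splits on whitespace
        const std::string::size_type colon = setting.find(':');
        // No colon, or a colon as first or last character: skip the setting.
        if (colon == std::string::npos || colon == 0 ||
            setting.back() == ':') {
            continue;
        }
        // Everything to the right of the first colon is the value.
        const std::string name = setting.substr(0, colon);
        const std::string value = setting.substr(colon + 1);
        if (name == "vertical") {
            if (value == "rl" || value == "lr") cue.vertical = value;
        } else if (name == "align") {
            if (value == "start" || value == "middle" || value == "end" ||
                value == "left" || value == "right") {
                cue.align = value;
            }
        }
        // Unknown or wrongly cased keywords fall through and are ignored.
    }
    return cue;
}
```

With this sketch, "align:start align:end" ends up with align "end", while "vertical:lr;line:50%" leaves the writing direction at its default because "lr;line:50%" is not a valid value.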

For a while I thought that the tests covering the parsing rules that run before the keyword-matching algorithm could be generic and simply check that nothing gets changed. However, that would mean asserting many things in one test, so I thought it better to split them up and test only one thing at a time.

Cue Generic Settings (csgeneric_unittest.cpp)

Changes

  • MultipleCueSettings2Mixed
    • Used "cue-settings/2-cue-settings-0x20.vtt".
    • Changed to "cue-settings/2-cue-settings-mixed.vtt".
  • SameCueSetting
    • It does not throw an error (WEBVTT_VERTICAL_ALREADY_SET).
    • The settings are parsed in order, so the rightmost one (last in the list) is used.
    • Vertical will be right to left.
  • BadDelimiter
    • It does not throw an error (WEBVTT_EXPECTED_WHITESPACE).
    • WEBVTT_EXPECTED_WHITESPACE is wrong anyway; it should have been WEBVTT_VERTICAL_BAD_VALUE, because the parser takes everything to the right of the first colon in a setting as the value.
    • The parser should try to use "vertical" as keyword and "lr;line:50%" as value.
    • It should skip the malformed setting.
  • BadDelimiter2
    • It does not throw an error (WEBVTT_EXPECTED_WHITESPACE).
    • The parser does not require a spacing character between the cue end time timestamp and the settings. However, there cannot be four digits in a row after the decimal point of the cue end time timestamp.
    • It should skip the malformed setting ("^line").
    • Line should be "auto" because that is default.
  • NoDelimiter
    • It does not throw an error (WEBVTT_EXPECTED_WHITESPACE).
    • The parser does not require a spacing character between cue end time timestamp and settings.
    • It should parse the line setting normally.

Added

  • DigitDelimiter
    • Essentially the same as making sure the fractions of a second in a timestamp do not have 4 digits.
    • Since the parser does not require a delimiter between the cue end time timestamp and the settings, a possible malformed setting could have a keyword that starts with a digit.
    • This should throw the error WEBVTT_MALFORMED_TIMESTAMP.
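The digit rule behind DigitDelimiter can be sketched like this. `collectDigits` mirrors the helper of the same name in my pseudo-code; `collectFraction` is a hypothetical name I'm using here for the "exactly three digits after the decimal point" check, and it reports failure with a return value instead of the project's error codes.

```cpp
#include <cstddef>
#include <string>

// Collect consecutive ASCII digits starting at pos, advancing pos past them.
std::string collectDigits(const std::string& input, std::size_t& pos) {
    std::string digits;
    while (pos < input.size() && input[pos] >= '0' && input[pos] <= '9') {
        digits += input[pos++];
    }
    return digits;
}

// Hypothetical helper: read the fraction that follows the "." of a timestamp.
// Returns false (a malformed timestamp) unless exactly three digits are found.
bool collectFraction(const std::string& input, std::size_t& pos, int& millis) {
    const std::string digits = collectDigits(input, pos);
    if (digits.size() != 3) {
        return false;
    }
    millis = std::stoi(digits);
    return true;
}
```

Given "000line:50%" the fraction is accepted and parsing continues at "line:50%". Given "0001line:50%" the digit run is four characters long, so the timestamp itself is malformed before any setting is looked at.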

Cue Setting Align (csalign_unittest.cpp)

The default value for align is middle. If a malformed setting is encountered, it is ignored. If a setting is used more than once, the rightmost setting (last in the token list) is used.

Changed

  • NoKeyword
    • It does not throw an error (WEBVTT_INVALID_CUESETTING).
    • It should skip the malformed setting.
    • Align should be middle because that is default.
  • BadValue
    • It does not throw an error (WEBVTT_ALIGN_BAD_VALUE).
    • It should skip the malformed setting.
    • Align should be middle because that is default.
  • BadDelimiter
    • It does not throw an error (WEBVTT_ALIGN_BAD_VALUE).
    • It should skip the malformed setting.
    • Align should be middle because that is default.
  • NoValue
    • It does not throw an error (WEBVTT_INVALID_CUESETTING).
    • It should skip the malformed setting.
    • Align should be middle because that is default.
  • NoDelimiter
    • It does not throw an error (WEBVTT_INVALID_CUESETTING).
    • It should skip the malformed setting.
    • Align should be middle because that is default.

Added

  • UppercaseKeyword
    • Setting keywords are case-sensitive and must be lowercase.
    • Align should be middle because that is default.
  • UppercaseValue
    • Setting values are case-sensitive and must be lowercase.
    • Align should be middle because that is default.
  • BadKeyword
    • Make sure align is "middle" when no valid align setting.

Cue Setting Line (csline_unittest.cpp)

The line setting affects two things: the line position and the snap-to-lines flag. The default value for line position is the string "auto". The default value for snap-to-lines is true.
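Here is a rough C++ sketch of how a line value could update both fields. `LineSetting` and `applyLineValue` are hypothetical names, not the wrapper's API. I'm also assuming, from my reading of the WebVTT parsing rules, that a percentage value turns snap-to-lines off and a plain number leaves it on.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical model of the two things the line setting touches.
struct LineSetting {
    bool isAuto = true;       // line position defaults to "auto"
    long position = 0;        // only meaningful when isAuto is false
    bool snapToLines = true;  // default
};

// Apply a line value. Returns false and leaves `out` untouched when the
// value is malformed, so the defaults survive.
bool applyLineValue(const std::string& value, LineSetting& out) {
    if (value.empty()) return false;
    const bool percent = value.back() == '%';
    bool sawDigit = false;
    for (std::size_t i = 0; i < value.size(); ++i) {
        const char c = value[i];
        if (c >= '0' && c <= '9') {
            sawDigit = true;
        } else if (c == '-' && i == 0) {
            // a leading minus sign is allowed
        } else if (c == '%' && i + 1 == value.size()) {
            // a trailing percent sign is allowed
        } else {
            return false;
        }
    }
    if (!sawDigit) return false;

    long number = 0;
    try {
        number = std::stol(percent ? value.substr(0, value.size() - 1) : value);
    } catch (const std::out_of_range&) {
        return false;  // sketch only: treat overflow as malformed
    }
    if (percent && (number < 0 || number > 100)) return false;

    out.isAuto = false;
    out.position = number;
    out.snapToLines = !percent;  // assumption: percentages clear the flag
    return true;
}
```

So "-0" gives position 0 with snap-to-lines still true, "055%" gives position 55 with snap-to-lines false, and "101%" or "-5%" are skipped.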

I made sure error count was 0 for all tests that should pass.

Changed

  • SingleDigitNegativeLowBoundary
    • Negative zero equals zero so zero is used.
  • DoubleDigitNegativeLowBoundary
    • Negative zero equals zero so zero is used.
  • BadValue
    • It does not throw an error (WEBVTT_LINE_BAD_VALUE).
    • It should skip the malformed setting.
    • Line should be "auto" and snap-to-lines should be true because that is default.
  • BadDelimiter
    • It does not throw an error (WEBVTT_INVALID_CUESETTING).
    • It should skip the malformed setting.
    • Line should be "auto" and snap-to-lines should be true because that is default.
  • BadKeyword
    • It does not throw an error (WEBVTT_INVALID_CUESETTING).
    • It should skip the malformed setting.
    • Line should be "auto" and snap-to-lines should be true because that is default.
  • BadValueSuffix
    • Actually test two settings.
    • It does not throw an error (WEBVTT_LINE_BAD_VALUE).
    • It should skip the malformed settings.
    • Line should be "auto" and snap-to-lines should be true because that is default.
  • WhitespaceDelimiter
    • Actually test two settings.
    • It does not throw an error (WEBVTT_INVALID_CUESETTING).
    • It should skip the malformed settings.
    • Line should be "auto" and snap-to-lines should be true because that is default.
  • BadWhitespaceBeforeDelimiter
    • Actually test two settings.
    • It does not throw an error (WEBVTT_INVALID_CUESETTING).
    • It should skip the malformed settings.
    • Line should be "auto" and snap-to-lines should be true because that is default.
  • BadWhitespaceAfterDelimiter
    • Actually test two settings.
    • It does not throw an error (WEBVTT_INVALID_CUESETTING).
    • It should skip the malformed settings.
    • Line should be "auto" and snap-to-lines should be true because that is default.

Added

  • ManyDigit
    • More than two digits are allowed.
  • ManyDigitHighBoundary
    • More than two digits are allowed.
    • The maximum value of an int is at least 32767.
  • ManyDigitLowBoundary
    • More than two digits are allowed.
    • 00000
  • ManyDigitNegative
    • More than two digits are allowed.
  • ManyDigitNegativeHighBoundary
    • More than two digits are allowed.
    • The minimum value of an int is -32767 or lower.
  • ManyDigitNegativeLowBoundary
    • More than two digits are allowed.
    • 00000
  • ManyDigitPercentage
    • More than two digits are allowed.
    • 055%
  • ManyDigitPercentageHighBoundary
    • More than two digits are allowed.
    • 100%
  • ManyDigitPercentageLowBoundary
    • More than two digits are allowed.
    • 000%
  • NoKeyword
    • It should skip the malformed setting.
    • Line should be "auto" and snap-to-lines should be true because that is default.
  • NoValue
    • It should skip the malformed setting.
    • Line should be "auto" and snap-to-lines should be true because that is default.
  • UppercaseKeyword
    • Setting keywords are case-sensitive and must be lowercase.
    • Line should be "auto" and snap-to-lines should be true because that is default.
  • NoDelimiter
    • It should skip the malformed setting.
    • Line should be "auto" and snap-to-lines should be true because that is default.
  • PercentNegative
    • It should skip the malformed setting.
    • Line should be "auto" and snap-to-lines should be true because that is default.
  • PercentOver100
    • It should skip the malformed setting.
    • Line should be "auto" and snap-to-lines should be true because that is default.

Cue Setting Position (csposition_unittest.cpp)

The default value for text position is 50%. The max is 100%. The min is 0%.
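As a sketch of those bounds, a percentage value could be checked like this. `parsePercentage` is a hypothetical helper, not something in the code base; it returns the supplied default whenever the value is malformed or out of range.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: parse a value such as "055%".
// Returns `fallback` (the setting's default) for anything malformed.
int parsePercentage(const std::string& value, int fallback) {
    // Need at least one digit followed by the trailing percent sign.
    if (value.size() < 2 || value.back() != '%') {
        return fallback;
    }
    int number = 0;
    for (std::size_t i = 0; i + 1 < value.size(); ++i) {
        const char c = value[i];
        if (c < '0' || c > '9') {
            return fallback;  // minus signs, extra "%" and so on
        }
        number = number * 10 + (c - '0');
        if (number > 100) {
            return fallback;  // over the maximum; also avoids overflow
        }
    }
    return number;
}
```

For position the call would be `parsePercentage(value, 50)`: "055%" gives 55, "000%" gives 0, and "101%", "-5%" or "50" all leave the position at 50.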

Changed

  • TripleDigitPercentage
    • Max value is 100%.
    • Changed value to 055%.
  • TripleDigitPercentageHighBoundary
    • Max value is 100%.
    • Changed value to 100%.
  • DoubleDigitPercentageLowBoundary
    • 00 is same as 0, so 0 is used.
  • TripleDigitPercentageLowBoundary
    • 000 is same as 0, so 0 is used.
  • NoDelimiter
    • It does not throw an error (WEBVTT_INVALID_CUESETTING).
    • It should skip the malformed setting.
    • Position should be 50 because that is default.
  • NoKeyword
    • It does not throw an error (WEBVTT_INVALID_CUESETTING).
    • It should skip the malformed setting.
    • Position should be 50 because that is default.
  • NoPercentSign
    • It does not throw an error (WEBVTT_POSITION_BAD_VALUE).
    • It should skip the malformed setting.
    • Position should be 50 because that is default.
  • BadDelimiter
    • It does not throw an error (WEBVTT_INVALID_CUESETTING).
    • It should skip the malformed setting.
    • Position should be 50 because that is default.
  • AsciiDigitBeyondHighBoundary
    • It does not throw an error (WEBVTT_POSITION_BAD_VALUE).
    • It should skip the malformed setting.
    • Position should be 50 because that is default.
  • AsciiDigitBeyondLowBoundary
    • It does not throw an error (WEBVTT_POSITION_BAD_VALUE).
    • It should skip the malformed setting.
    • Position should be 50 because that is default.

Added

  • NoValue
    • It should skip the malformed setting.
    • Position should be 50 because that is default.
  • PercentNegative
    • It should skip the malformed setting.
    • Position should be 50 because that is default.
  • PercentOver100
    • It should skip the malformed setting.
    • Position should be 50 because that is default.
  • UppercaseKeyword
    • Setting keywords are case-sensitive and must be lowercase.
    • Position should be 50 because that is default.
  • BadKeyword
    • It does not throw an error (WEBVTT_INVALID_CUESETTING).
    • It should skip the malformed setting.
    • Position should be 50 because that is default.

Removed

  • tc4015-cue_settings_line_08_bad.test
    • Not used.

Cue Setting Size (cssize_unittest.cpp)

The default size is 100%. The max is 100% and the min is 0%.

The tests are identical to the cue setting position tests, so I copied those tests and edited them as needed. Most of the file has changed, and the changes mirror the cue setting position changes and additions.

Cue Setting Vertical (csvertical_unittest.cpp)

The default writing direction is horizontal.

Changed


  • BadKeyword
    • It does not throw an error (WEBVTT_INVALID_CUESETTING).
    • It should skip the malformed setting.
    • Writing direction should be horizontal because that is default.
  • BadDelimiter
    • It does not throw an error (WEBVTT_INVALID_CUESETTING).
    • It should skip the malformed setting.
    • Writing direction should be horizontal because that is default.
  • BadWhitespaceBeforeDelimiter
    • Actually test two settings.
    • It does not throw an error (WEBVTT_INVALID_CUESETTING).
    • It should skip the malformed settings.
    • Writing direction should be horizontal because that is default.
  • BadWhitespaceAfterDelimiter
    • Actually test two settings.
    • It does not throw an error (WEBVTT_INVALID_CUESETTING).
    • It should skip the malformed settings.
    • Writing direction should be horizontal because that is default.

Added


  • UppercaseKeyword
    • Setting keywords are case-sensitive and must be lowercase.
    • Writing direction should be horizontal because that is default.
  • UppercaseValue
    • Setting values are case-sensitive and must be lowercase.
    • Writing direction should be horizontal because that is default.
  • NoValue
    • It should skip the malformed setting.
    • Writing direction should be horizontal because that is default.
  • NoKeyword
    • It should skip the malformed setting.
    • Writing direction should be horizontal because that is default.
  • NoDelimiter
    • It should skip the malformed setting.
    • Writing direction should be horizontal because that is default.

Failing Tests

[  FAILED  ] CueSettingSize.NoDelimiter
[  FAILED  ] CueSettingSize.NoKeyword
[  FAILED  ] CueSettingSize.NoValue
[  FAILED  ] CueSettingSize.NoPercentSign
[  FAILED  ] CueSettingSize.BadDelimiter
[  FAILED  ] CueSettingSize.BadKeyword
[  FAILED  ] CueSettingSize.AsciiDigitBeyondHighBoundary
[  FAILED  ] CueSettingSize.AsciiDigitBeyondLowBoundary
[  FAILED  ] CueSettingSize.PercentNegative
[  FAILED  ] CueSettingSize.PercentOver100
[  FAILED  ] CueSettingSize.UppercaseKeyword

[  FAILED  ] CueSettingPosition.NoDelimiter
[  FAILED  ] CueSettingPosition.NoKeyword
[  FAILED  ] CueSettingPosition.NoValue
[  FAILED  ] CueSettingPosition.NoPercentSign
[  FAILED  ] CueSettingPosition.BadDelimiter
[  FAILED  ] CueSettingPosition.BadKeyword
[  FAILED  ] CueSettingPosition.AsciiDigitBeyondHighBoundary
[  FAILED  ] CueSettingPosition.AsciiDigitBeyondLowBoundary
[  FAILED  ] CueSettingPosition.PercentNegative
[  FAILED  ] CueSettingPosition.PercentOver100
[  FAILED  ] CueSettingPosition.UppercaseKeyword

[  FAILED  ] CueSettingVertical.BadKeyword
[  FAILED  ] CueSettingVertical.BadDelimiter
[  FAILED  ] CueSettingVertical.BadValue
[  FAILED  ] CueSettingVertical.BadWhitespaceBeforeDelimiter
[  FAILED  ] CueSettingVertical.BadWhitespaceAfterDelimiter
[  FAILED  ] CueSettingVertical.NoKeyword
[  FAILED  ] CueSettingVertical.NoValue
[  FAILED  ] CueSettingVertical.NoDelimiter
[  FAILED  ] CueSettingVertical.UppercaseKeyword
[  FAILED  ] CueSettingVertical.UppercaseValue

[  FAILED  ] CueSettingAlign.Left
[  FAILED  ] CueSettingAlign.Right
[  FAILED  ] CueSettingAlign.BadKeyword
[  FAILED  ] CueSettingAlign.NoKeyword
[  FAILED  ] CueSettingAlign.BadValue
[  FAILED  ] CueSettingAlign.NoValue
[  FAILED  ] CueSettingAlign.BadDelimiter
[  FAILED  ] CueSettingAlign.NoDelimiter
[  FAILED  ] CueSettingAlign.UppercaseKeyword
[  FAILED  ] CueSettingAlign.UppercaseValue

[  FAILED  ] CueSettingLine.ManyDigitLowBoundary
[  FAILED  ] CueSettingLine.ManyDigitPercentage
[  FAILED  ] CueSettingLine.ManyDigitPercentageHighBoundary
[  FAILED  ] CueSettingLine.ManyDigitPercentageLowBoundary
[  FAILED  ] CueSettingLine.DoubleDigitPercentage
[  FAILED  ] CueSettingLine.DoubleDigitPercentageHighBoundary
[  FAILED  ] CueSettingLine.DoubleDigitPercentageLowBoundary
[  FAILED  ] CueSettingLine.SingleDigitPercentage
[  FAILED  ] CueSettingLine.SingleDigitPercentageHighBoundary
[  FAILED  ] CueSettingLine.SingleDigitPercentageLowBoundary
[  FAILED  ] CueSettingLine.BadKeyword
[  FAILED  ] CueSettingLine.NoKeyword
[  FAILED  ] CueSettingLine.NoDelimiter
[  FAILED  ] CueSettingLine.BadDelimiter
[  FAILED  ] CueSettingLine.BadValue
[  FAILED  ] CueSettingLine.NoValue
[  FAILED  ] CueSettingLine.BadValueSuffix
[  FAILED  ] CueSettingLine.WhitespaceDelimiter
[  FAILED  ] CueSettingLine.BadWhitespaceBeforeDelimiter
[  FAILED  ] CueSettingLine.BadWhitespaceAfterDelimiter
[  FAILED  ] CueSettingLine.UppercaseKeyword
[  FAILED  ] CueSettingLine.PercentNegative
[  FAILED  ] CueSettingLine.PercentOver100

[  FAILED  ] CueSetting.SameCueSetting
[  FAILED  ] CueSetting.BadDelimiter
[  FAILED  ] CueSetting.BadDelimiter2
[  FAILED  ] CueSetting.NoDelimiter
[  FAILED  ] CueSetting.DigitDelimiter

Tuesday, 27 November 2012

WebVTT Parser Should Use Parser Specifications

Do Not Use Syntax Specification For Parser

From what I've been reading in the unit tests, it seems that the parser was written with the syntax rules of the WebVTT specification in mind. It should not have been. It must be written to the parsing specification. I've only managed to review the generic cue setting tests and the align setting tests. I've had to change 10 of 25 tests, all of which were throwing errors and shouldn't have been. I don't really mind that. 5 and 3 of the remaining tests are testing the same thing. Tests are good. What I mind is that I had to rewrite all the comments and documentation because they all referenced the syntax rules and not the parsing rules.

If you don't want to read the parsing specification, that's fine. I wrote it all out in pseudo-code. Let me re-post it. Obviously much of it needs to be expanded to actually work, and most of that is fine, especially the string handling parts. But otherwise it should work exactly this way. I would expect that a number of the current bugs could be solved just by making the parser conform to the parsing specification.

You may want to change the Bad Cue Loop. I put it where it seems to be right according to the spec, but there may be a better place for it. It should only run if the cue timings are malformed, since that is the only place where an error can be thrown.

We are actually coding in C with the addition of a C++ wrapper.

I edited it a bit to make it look a bit better and fill in some parts.

Parser Pseudo-Code


// Represents a dynamically updating list
interface TextTrackCueList {
  readonly attribute unsigned long length;   // Number of cues
  getter TextTrackCue (unsigned long index);
  TextTrackCue? getCueById(DOMString id);    // By identifier
};

enum AutoKeyword { "auto" };

[Constructor(double startTime, double endTime, DOMString text)]
interface TextTrackCue : EventTarget {
  readonly attribute TextTrack? track;

           attribute DOMString id;              // Identifier
           attribute double startTime;
           attribute double endTime;
           attribute boolean pauseOnExit;
           attribute DOMString vertical;
           attribute boolean snapToLines;
           attribute (long or AutoKeyword) line;
           attribute long position;
           attribute long size;
           attribute DOMString align;
           attribute DOMString text;
  DocumentFragment getCueAsHTML();

           attribute EventHandler onenter;
           attribute EventHandler onexit;
};

interface Node {}

interface InternalNode : Node {
    OrderedList<Node> children;
    OrderedList<String> classNames;
}

interface LeafNode : Node {}

interface ClassNode : InternalNode {}

interface ItalicsNode : InternalNode {}

interface BoldNode : InternalNode {}

interface UnderlineNode : InternalNode {}

interface RubyNode : InternalNode {}

interface RubyTextNode : InternalNode {}

interface VoiceNode : InternalNode {
    attribute String voiceName;
}

interface TextNode : LeafNode {
    attribute String text;
}

interface TimestampNode : LeafNode {
    attribute double timestampSeconds;
}

interface Token {}

interface StringToken : Token {
    attribute String value;
}

interface StartTagToken : Token {
    attribute String tagName;
    attribute OrderedList<String> classes; // Could be done like TextTrackCueList?
    attribute String annotation;
}

interface EndTagToken : Token {
    attribute String tagName;
}

interface TimestampTagToken : Token {
    attribute double value;
}

Method parse (ByteStream byteStreamInput, OrderedList<TextTrackCue> output)
   String input = convert asynchronous byteStreamInput to Unicode
   
   replace NULL characters with REPLACEMENT CHARACTER characters
   replace CARRIAGE RETURN LINE FEED (CRLF) character pairs with single LINE FEED
   replace CARRIAGE RETURN characters with LINE FEED characters
   
   Integer position = start of input
   
   If character at position in input is BYTE ORDER MARK
      advancePosition(input, position)
   End If
   
   String line
   Boolean alreadyCollectedLine = False
   
   line = collectLine(input, position)
   
   If line has less than 6 characters
      Throw Error
   End If
   
   If line has exactly 6 characters and is not "WEBVTT"
      Throw Error
   End If
   
   If line has more than 6 characters
   and (first 6 characters not "WEBVTT" or (7th character not SPACE or TAB))
      Throw Error 
   End If
   
   If position is past end of input
      return
   End If
   
   If character at position in input is LINE FEED
      advancePosition(input, position)
   End If
   
   # Header
   Do
      line = collectLine(input, position)
      
      If position is past end of input
         return
      End If
      
      If character at position in input is LINE FEED
         advancePosition(input, position)
      End If
      
      If line contains "-->"
         alreadyCollectedLine = True
         Exit Do Loop
      End If
   While line is not empty
   
   # Cue Loop
   Loop
      If alreadyCollectedLine is False
         While character in input at position is LINE FEED
            advancePosition(input, position)
         End While
         
         line = collectLine(input, position)
         
         If line is empty
            Exit Loop
         End If
      End If

      TextTrackCue cue = new TextTrackCue()
      
      cue.identifier = empty
      cue.pauseOnExit = False
      cue.writingDirection = horizontal
      cue.snapToLines = True
      cue.linePosition = auto
      cue.textPosition = 50
      cue.size = 100
      cue.alignment = middleAlignment
      cue.text = empty
      
      If line does not contain "-->"
         cue.identifier = line
         
         If position is past end of input
            Exit Loop
         End If
         
         If character at position in input is LINE FEED
            advancePosition(input, position)
         End If
         
         line = collectLine(input, position)
         
         If line is empty
            Exit Loop
         End If
      End If
      
      alreadyCollectedLine = False
      
      Try
         collectCueTimingsAndSettings(line, cue)
      Catch
         Boolean end = False
         
         # Bad cue loop
         Loop
            If position is past end of input
               end = true
               Exit Loop
            End If
            
            If character at position in input is LINE FEED
               advancePosition(input, position)
            End If
            
            line = collectLine(input, position)
            
            If line contains "-->"
               alreadyCollectedLine = True
               Exit Loop
            End If
            
            If line is empty
               Exit Loop
            End If
         End Loop
         
         If end is true
            Exit Loop
         End If
         
         Continue Loop
      End Try

      String cueText = empty
      
      # Cue text loop
      Loop
         If position is past end of input
            Exit Loop
         End If
         
         If character at position in input is LINE FEED
            advancePosition(input, position)
         End If
         
         line = collectLine(input, position)
         
         If line is empty
            Exit Loop
         End If
         
         If line contains "-->"
            alreadyCollectedLine = True
            Exit Loop
         End If
         
         If cueText is not empty
            cueText += LINE FEED
         End If
         
         cueText += line
      End Loop
      
      # Cue text processing
      cue.text = cueTextDomConstruction(parseCueText(cueText))
      
      output append cue
   End Loop
End Method parse

Method advancePosition(String input, Integer position)
   If position is at the end of input and bytestream has not ended
      Wait for bytestream to add characters to input
   End If
   
   If bytestream has ended and next position is past end of input
      position = past end of input
   Else
      position = location of next character in input
   End If
End Method advancePosition

Function String collectLine(String input, Integer position)
   String result = empty
   
   While position not past end of input and character in input at position not LINE FEED
      result += character in input at position
      advancePosition(input, position)
   End While
   
   return result
End Function collectLine

Method collectCueTimingsAndSettings(String input, TextTrackCue cue)
   String remainder
   Integer position
   
   position = start of input
   
   skipWhitespace(input, position)
   
   cue.startTime = collectTimestamp(input, position)
   
   skipWhitespace(input, position)
   
   If character at position in input is not "-"
      Throw Error
   Else
      position = location of next character in input
   End If
   
   If character at position in input is not "-"
      Throw Error
   Else
      position = location of next character in input
   End If
   
   If character at position in input is not ">"
      Throw Error
   Else
      position = location of next character in input
   End If
   
   skipWhitespace(input, position)
   
   cue.endTime = collectTimestamp(input, position)
   
   remainder = remainder of input starting at position
   
   parseSettings(remainder, cue)
End Method

# Defined in http://dev.w3.org/html5/spec/common-microsyntaxes.html#common-parser-idioms
Method skipWhitespace(String input, Integer position)
   While character in input at position is SPACE or TAB OR LINE FEED or FORM FEED or CARRIAGE RETURN
      position = location of next character in input
   End While
End Method

Method parseSettings(String input, TextTrackCue cue)
   OrderedList<String> settings = input split on SPACE and TAB
   
   For Each String setting in settings
      If setting does not contain ":" or first or last character in setting is ":"
         Next setting
      End If
      
      String name = substring of setting between start of setting and first ":"
      
      String value = substring of setting between first ":" and end of setting
      
      Switch (name)
         Case "vertical"
            If value is "rl"
               cue.writingDirection = verticalGrowingLeft
            End If
            
            If value is "lr"
               cue.writingDirection = verticalGrowingRight
            End If
            
            Break
         
         Case "line"
            If value contains characters other than "-", "%", or "0" through "9"
               Break
            End If
            
            If value does not contain at least one character between "0" through "9"
               Break
            End If
            
            If any character in value other than the first is "-"
               Break
            End If
            
            If any character in value other than the last is "%"
               Break
            End If
            
            Integer number = parse substring of value excluding trailing "%" as a signed integer
            
            If last character in value is "%" and (number < 0 or number > 100)
               Break
            End If
            
            cue.linePosition = number
            
            If last character in value is "%"
               cue.snapToLines = False
            Else
               cue.snapToLines = True
            End If
            
            Break
            
         Case "position"
            If value contains characters other than "%" or "0" through "9"
               Break
            End If
            
            If value does not contain at least one character between "0" through "9"
               Break
            End If
            
            If any character in value other than the last is "%"
               Break
            End If
            
            If last character in value is not "%"
               Break
            End If
            
            Integer number = parse substring of value excluding trailing "%" as a signed integer
            
            If number < 0 or number > 100
               Break
            End If
            
            cue.textPosition = number
            
            Break
            
         Case "size"
            If value contains characters other than "%" or "0" through "9"
               Break
            End If
            
            If value does not contain at least one character between "0" through "9"
               Break
            End If
            
            If any character in value other than the last is "%"
               Break
            End If
            
            If last character in value is not "%"
               Break
            End If
            
            Integer number = parse substring of value excluding trailing "%" as a signed integer
            
            If number < 0 or number > 100
               Break
            End If
            
            cue.size = number
            
            Break
            
         Case "align"
            If value is "start"
               cue.alignment = startAlignment
            End If
            
            If value is "middle"
               cue.alignment = middleAlignment
            End If
            
            If value is "end"
               cue.alignment = endAlignment
            End If
            
            If value is "left"
               cue.alignment = leftAlignment
            End If
            
            If value is "right"
               cue.alignment = rightAlignment
            End If
            
            Break
      End Switch (name)
   End For Each setting
End Method parseSettings

Function Float collectTimestamp(String input, Integer position)
   Enumerable SignificantUnits
      Minutes
      Hours
   End Enumerable
   
   Integer value1, value2, value3, value4
   String string
   SignificantUnits mostSignificantUnits
   
   mostSignificantUnits = Minutes
   
   If position is past end of input
      Throw Error
   End If
   
   If character at position in input is not "0" through "9"
      Throw Error
   End If
   
   string = collectDigits(input, position)
   
   value1 = parse string to integer
   
   If string not exactly two characters or value1 > 59 then
      mostSignificantUnits = Hours
   End If
   
   If position is past end of input or character in input at position is not ":"
      Throw Error
   Else
      position = location of next character in input
   End If
   
   string = collectDigits(input, position)
   
   If string not exactly two characters
      Throw Error
   End If
   
   value2 = parse string to integer
   
   If mostSignificantUnits = Hours
   or (position not past end of input and character at position in input is ":")
      If position is past end of input or character in input at position is not ":"
         Throw Error
      Else
         position = location of next character in input
      End If
      
      string = collectDigits(input, position)
      
      If string not exactly two characters
         Throw Error
      End If
      
      value3 = parse string to integer
   Else
      value3 = value2
      value2 = value1
      value1 = 0
   End If
   
   If position is past end of input or character in input at position is not "."
      Throw Error
   Else
      position = location of next character in input
   End If
   
   string = collectDigits(input, position)
   
   If string not exactly three characters
      Throw Error
   End If
   
   value4 = parse string to integer
   
   If value2 > 59 or value3 > 59
      Throw Error
   End If
   
   return value1 * 60 * 60 + value2 * 60 + value3 + value4 / 1000
End Function collectTimestamp

Function String collectDigits(String input, Integer position)
   String result = empty
   
   While position not past end of input and character in input at position is "0" through "9"
      result += character in input at position
      position = location of next character in input
   End While
   
   return result
End Function collectDigits

Function OrderedList<Node> parseCueText (String input)
   Integer position = start of input
   InternalNode root = new InternalNode
   InternalNode current = root

   Loop
      If position is past End of input
         return root.children
      End If
      
      Token token = cueTextTokenizer(input, position)
      
      Switch (typeof(token))
         Case StringToken
            current.children append new TextNode(text: token.value)
            Break
            
         Case StartTagToken
            Switch (token.tagName)
               Case "c"
                  ClassNode node = new ClassNode()
                  appendClassesToNode(node, token)
                  current.children append node
                  current = node
                  Break
               
               Case "i"
                  ItalicsNode node = new ItalicsNode()
                  appendClassesToNode(node, token)
                  current.children append node
                  current = node
                  Break
                  
               Case "b"
                  BoldNode node = new BoldNode()
                  appendClassesToNode(node, token)
                  current.children append node
                  current = node
                  Break
                  
               Case "u"
                  UnderlineNode node = new UnderlineNode()
                  appendClassesToNode(node, token)
                  current.children append node
                  current = node
                  Break
                  
               Case "ruby"
                  RubyNode node = new RubyNode()
                  appendClassesToNode(node, token)
                  current.children append node
                  current = node
                  Break
                  
               Case "rt"
                  If typeof(current) is RubyNode
                     RubyTextNode node = new RubyTextNode()
                     appendClassesToNode(node, token)
                     current.children append node
                     current = node
                  End If
                  Break
               
               Case "v"
                  VoiceNode node = new VoiceNode()
                  appendClassesToNode(node, token)
                  
                  If token.annotation is not null
                     node.annotation = token.annotation
                  else
                     node.annotation = empty
                  End If
                  
                  current.children append node
                  current = node
                  Break
            End Switch (token.tagName)
            
            Break
            
         Case EndTagToken
            If (token.tagName is "c" And typeof(current) is ClassNode)
            Or (token.tagName is "i" And typeof(current) is ItalicsNode)
            Or (token.tagName is "b" And typeof(current) is BoldNode)
            Or (token.tagName is "u" And typeof(current) is UnderlineNode)
            Or (token.tagName is "ruby" And typeof(current) is RubyNode)
            Or (token.tagName is "rt" And typeof(current) is RubyTextNode)
            Or (token.tagName is "v" And typeof(current) is VoiceNode)
               current = parent of current
            else If token.tagName is "ruby" And typeof(current) is RubyTextNode
               current = parent of parent of current
            End If
            
            Break
         
         Case TimestampTagToken
            
      End Switch (typeof(token))
   End loop
End Function parseCueText

Method appendClassesToNode(InternalNode node, Token token)
   for each className in token.classes
      If className not empty
         node.classes append className
      End If
   End for
End Method appendClassesToNode

Function Token cueTextTokenizer(String input, Integer position)
   Enumerable TokenizerStates
      dataState
      escapeState
      tagState
      startTagState
      startTagClassState
      startTagAnnotationState
      EndTagState
      timestampTagState
   End Enumerable
   
   TokenizerStates tokenizerState = dataState
   String result = empty
   String buffer = empty
   OrderedList<String> classes = empty
   Character c
   
   loop
      If position is past End of input
         c = End of file marker
      else
         c = character in input indicated by position
      End If
      
      Switch (tokenizerState)
         Case dataState
            Switch (c)
               Case "&"
                  buffer = c
                  tokenizerState = escapeState
                  Break
               
               Case "<"
                  If result is empty
                     tokenizerState = tagState
                  else
                     return new StringToken(result)
                  End If
                  Break
               
               Case End-OF-FILE MARKER
                  return new StringToken(result)
                  Break
               
               default
                  result += c
            End Switch (c)
            
            Break
         
         Case escapeState
            Switch (c)
               Case "&"
                  result += buffer
                  buffer = c
                  Break
               
               Case "0" to "9"
               Case "a" to "z"
               Case "A" to "Z"
                  buffer += c
                  Break
               
               Case ";"
                  Switch (buffer)
                     Case "&amp"
                        result += "&"
                        Break
                     
                     Case "&lt"
                        result += "<"
                        Break
                     
                     Case "&gt"
                        result += ">"
                        Break
                     
                     Case "&lrm"
                        result += LEFT-TO-RIGHT MARK
                        Break
                     
                     Case "&rlm"
                        result += RIGHT-TO-LEFT MARK
                        Break
                     
                     Case "&nbsp"
                        result += NO-Break SPACE
                        Break
                     
                     default
                        result += buffer + ";"
                  End Switch (buffer)
                  
                  tokenizerState = dataState
                  Break
               
               Case "<"
               Case End-OF-FILE MARKER
                  result += buffer
                  return new StringToken(value: result)
                  Break
               
               default
                  result += buffer
                  result += c
                  tokenizerState = dataState
            End Switch (c)
            
            Break
         
         Case tagState
            Switch (c)
               Case TAB
               Case LINE FEED
               Case FORM FEED
               Case SPACE
                  tokenizerState = startTagAnnotationState
                  Break
                  
               Case "."
                  tokenizerState = startTagClassState
                  Break
               
               Case "/"
                  tokenizerState = EndTagState
                  Break
               
               Case "0" to "9"
                  result = c
                  tokenizerState = timestampTagState
                  Break
                  
               Case ">"
                  position = location of next character in input
               
               Case End-OF-FILE MARKER
                  return new StartTagToken(tagName: empty)
                  Break
               
               default
                  result = c
                  tokenizerState = startTagState
            End Switch (c)
            
            Break
            
         Case startTagState
            Switch (c)
               Case TAB
               Case LINE FEED
               Case SPACE
                  tokenizerState = startTagAnnotationState
                  Break
                  
               Case FORM FEED
                  buffer = c
                  tokenizerState = startTagAnnotationState
                  Break
                  
               Case "."
                  tokenizerState = startTagClassState
                  Break
                  
               Case ">"
                  position = location of next character in input
               
               Case End-OF-FILE MARKER
                  return new StartTagToken(tagName: result)
                  Break
               
               default
                  result += c
            End Switch (c)
            
            Break
         
         Case startTagClassState
            Switch (c)
               Case TAB
               Case LINE FEED
               Case SPACE
                  classes append buffer
                  buffer = empty
                  tokenizerState = startTagAnnotationState
                  Break
                  
               Case FORM FEED
                  classes append buffer
                  buffer = c
                  tokenizerState = startTagAnnotationState
                  Break
                  
               Case "."
                  classes append buffer
                  buffer = empty
                  Break
                  
               Case ">"
                  position = location of next character in input
               
               Case End-OF-FILE MARKER
                  classes append buffer
                  return new StartTagToken(tagName:result, classes: classes)
                  Break
               
               default
                  buffer += c
            End Switch (c)
            
            Break
         
         Case startTagAnnotationState
            Switch (c)               
               Case ">"
                  position = location of next character in input
               
               Case End-OF-FILE MARKER
                  remove leading and trailing space characters from buffer
                  replace sequences of one or more consecutive space characters with a single SPACE
                  return new StartTagToken(tagName: result, classes: classes, annotation: buffer)
                  Break
               
               default
                  buffer += c
            End Switch (c)
            
            Break
         
         Case EndTagState
            Switch (c)
               Case ">"
                  position = location of next character in input
               
               Case End-OF-FILE MARKER
                  return new EndTagToken(tagName: result)
                  Break
               
               default
                  result += c
            End Switch (c)
            
            Break
         
         Case timestampTagState
            Switch (c)
               Case ">"
                  position = location of next character in input
               
               Case End-OF-FILE MARKER
                  return new TimestampTagToken(tagName: result)
                  Break
               
               default
                  result += c
            End Switch (c)
            
            Break   
      End Switch (tokenizerState)
      
      position = location of next character in input
   End loop
End Function cueTextTokenizer

Function Tree cueTextDomConstruction(OrderedList<Node> nodes)
   # Unsure how to do this at this point
   # Refer to http://dev.w3.org/html5/webvtt/#webvtt-cue-text-dom-construction-rules
End Function cueTextDomConstruction
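
As a rough illustration, the collectTimestamp and collectDigits pseudocode above could be sketched in Python as follows. This is only a sketch under my own assumptions: the names collect_digits, expect, and collect_timestamp are made up for this example, and the functions return the new position alongside the value because Python strings and integers cannot be updated in place the way the pseudocode implies.

```python
from typing import Tuple

DIGITS = "0123456789"


def collect_digits(text: str, pos: int) -> Tuple[str, int]:
    """Collect consecutive ASCII digits from pos; return (digits, new position)."""
    end = pos
    while end < len(text) and text[end] in DIGITS:
        end += 1
    return text[pos:end], end


def expect(text: str, pos: int, char: str) -> int:
    """Require char at pos and return the position just after it."""
    if pos >= len(text) or text[pos] != char:
        raise ValueError(f"expected {char!r} at index {pos}")
    return pos + 1


def collect_timestamp(text: str, pos: int = 0) -> Tuple[float, int]:
    """Return (seconds, new position); raise ValueError on malformed input."""
    digits, pos = collect_digits(text, pos)
    if not digits:
        raise ValueError("timestamp must start with a digit")
    value1 = int(digits)
    # A first field that is not two digits, or is above 59, must be hours.
    hours_first = len(digits) != 2 or value1 > 59

    pos = expect(text, pos, ":")
    digits, pos = collect_digits(text, pos)
    if len(digits) != 2:
        raise ValueError("expected two digits after ':'")
    value2 = int(digits)

    if hours_first or (pos < len(text) and text[pos] == ":"):
        pos = expect(text, pos, ":")
        digits, pos = collect_digits(text, pos)
        if len(digits) != 2:
            raise ValueError("expected two digits for seconds")
        value3 = int(digits)
    else:
        # Only minutes and seconds were given, so shift the fields down.
        value1, value2, value3 = 0, value1, value2

    pos = expect(text, pos, ".")
    digits, pos = collect_digits(text, pos)
    if len(digits) != 3:
        raise ValueError("expected three digits for milliseconds")
    value4 = int(digits)

    if value2 > 59 or value3 > 59:
        raise ValueError("minutes and seconds must not exceed 59")
    return value1 * 3600 + value2 * 60 + value3 + value4 / 1000, pos
```

For example, collect_timestamp("00:01.500") takes the minutes-and-seconds branch, while collect_timestamp("01:02:03.250") takes the hours branch because a second ":" follows the second field.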

Sunday, 25 November 2012

Mozilla Developer Network Documentation

I've added a page on WebVTT to the Mozilla Developer Network (MDN). I've also added more information to the track page. Basically I transferred my blog posts on the syntax to MDN. It can also be reached by going to the HTML5 portal on MDN. I've added a section for track and WebVTT to the multimedia section.

IMPORTANT: From now on please refer to the MDN documentation. I will no longer update my previous posts on the subject.

I have added a section on WebVTT comments. Don't worry about the parser. The parsing of cues discards cues without the --> string. Notes are treated as cues, but because they do not have the string --> they are discarded. Thus no additional work needs to be done to implement them, although the validation program may need to be altered.
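
To illustrate why comments need no special handling, here is a much simplified Python sketch. It is not the parsing algorithm from the specification: the function name cue_blocks is hypothetical, and I am assuming that a block only counts as a cue when its first or second line (an optional identifier, then the timing line) contains the --> string.

```python
from typing import List


def cue_blocks(vtt_text: str) -> List[List[str]]:
    """Split a WebVTT file into blank-line separated blocks and keep only
    the blocks whose first or second line contains '-->'."""
    # Normalise the three accepted line terminators to a single form.
    lines = vtt_text.replace("\r\n", "\n").replace("\r", "\n").split("\n")

    blocks: List[List[str]] = []
    current: List[str] = []
    for line in lines[1:]:  # skip the WEBVTT signature line
        if line == "":
            if current:
                blocks.append(current)
            current = []
        else:
            current.append(line)
    if current:
        blocks.append(current)

    # NOTE blocks have no '-->' line, so they fall out here with no extra code.
    return [b for b in blocks if any("-->" in line for line in b[:2])]


sample = (
    "WEBVTT\n"
    "\n"
    "NOTE this is a comment\n"
    "\n"
    "00:01.000 --> 00:02.000\n"
    "Hello\n"
)
print(cue_blocks(sample))
```

The NOTE block is collected like any other block and then dropped by the same filter that drops any block lacking a timing line.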

In addition I've fixed a number of problems with my original post, mainly grammar and clarity.

https://developer.mozilla.org/en-US/docs/HTML/WebVTT
https://developer.mozilla.org/en-US/docs/HTML/HTML5
https://developer.mozilla.org/en-US/docs/HTML/HTML_Elements/track

Friday, 16 November 2012

Unlicense

Over the last few weeks I've done a lot of work making some programs for my parallel programming course. I'll get to those in another post. One problem I ran into was how to license them, since I have everything on GitHub. Now I really don't care if anyone uses something I did, nor do I care how they do it. I just want to be sure I'm not liable for anything. Basically I just want to release to public domain. MIT and BSD licenses are good, way better than GNU, but you still have to make sure the license is copied and the author attributed properly. Frankly I find that last part a bit vain. If you're going to make something free for anyone to use in any way, then just do it.

But you cannot just release to public domain. Although it depends on jurisdiction, many places do not let you make something public domain. The only way to get something in public domain is when the copyright expires. Copyright is automatically given to the author when they make something. However, you can make the license be effectively public domain. There are currently two licenses that do this. The first is the Creative Commons CC0 license. Unfortunately it doesn't seem the best suited to code, and worse, it is really long. The sheer length of it would make me worried about using the code because I just wouldn't really understand it all. That is why the MIT and BSD licenses were so good. Then I found the other one, the Unlicense.

Why Unlicense? Here is what they say.
Because you have more important things to do than enriching lawyers or imposing petty restrictions on users of your code. How often have you passed up on utilizing and contributing to a great software library just because its open source license was not compatible with your own preferred flavor of open source? How many precious hours of your life have you spent deliberating how to license your software or worrying about licensing compatibility with other software? You will never get those hours back, but here's your chance to start cutting your losses. Life's too short, let's get back to coding.

Source: http://unlicense.org/
And here it is.

This is free and unencumbered software released into the public domain.

Anyone is free to copy, modify, publish, use, compile, sell, or distribute this software, either in source code form or as a compiled binary, for any purpose, commercial or non-commercial, and by any means.

In jurisdictions that recognize copyright laws, the author or authors of this software dedicate any and all copyright interest in the software to the public domain. We make this dedication for the benefit of the public at large and to the detriment of our heirs and successors. We intend this dedication to be an overt act of relinquishment in perpetuity of all present and future rights to this software under copyright law.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

For more information, please refer to <http://unlicense.org/>
No liability and it is simple. However there is one gotcha. To protect yourself from accepting contributions which would automatically be copyrighted to the author of that contribution, every contributor should include a simple waiver with each contribution, or at least major contribution. If you want you can go further by requiring a digital signature. There are more details at http://unlicense.org

Of course this also means you can only use public domain (or equivalent) sources for code, but there is actually a lot of it out there.

Thursday, 15 November 2012

WebVTT Syntax Test Review

I'm going over all the syntax tests for WebVTT. These tests can be found in the following two locations.
Main repo: https://github.com/humphd/webvtt/tree/seneca/test/spec
My fork: https://github.com/KyleBarnhart/webvtt/tree/seneca/test/spec

I am not judging whether the tests actually pass, but rather whether they should pass. I am currently assuming that tests in the directories "good" and "known-bad" pass, and that tests in "bad" and "known-good" fail. I will modify the files to keep their intent. If I cannot do that, I will move them to the directory that matches their actual behavior. For example, a test in "good" that should not pass will be moved to "known-bad". Also, I will only list changes.

I've made a change to the WebVTT syntax post. Cue start times may be equal to previous start times, but not less than.
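
As a small sketch of that syntax rule, a validator could check the start times like this in Python. The function name start_times_in_order is hypothetical, and this is meant as a check for a validation tool, not something the parsing rules require.

```python
from typing import Sequence


def start_times_in_order(start_times: Sequence[float]) -> bool:
    """Return True if no cue starts earlier than the cue before it.

    Equal start times are allowed; only a decrease breaks the rule.
    """
    return all(
        earlier <= later
        for earlier, later in zip(start_times, start_times[1:])
    )


print(start_times_in_order([0.0, 1.5, 1.5, 4.0]))
print(start_times_in_order([0.0, 2.0, 1.0]))
```

The first call has a repeated start time, which is fine. The second has a cue that starts before its predecessor, which the rule forbids.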

I will submit a pull request after I have tested the changes.

Good

  • tc_1001_WEBVTT-no-BOM.test
    • Added one more line terminator. Two or more line terminators are required after WEBVTT.
  • tc_1002_WEBVTT-with-BOM.test
    • Added two line terminators. Two or more line terminators are required after WEBVTT.
  • tc_1003_WEBVTT-head.test
    • Added two line terminators. Two or more line terminators are required after WEBVTT.
  • tc_1004_WEBVTT-space.test
    • Added one more line terminator. Two or more line terminators are required after WEBVTT.
  • tc_1005_WEBVTT-tab.test
    • Added two line terminators. Two or more line terminators are required after WEBVTT.
  • tc_1007_space-text.test
    • Added one more line terminator. Two or more line terminators are required after WEBVTT.
  • tc_1008_tab-Text.test
    • Added one more line terminator. Two or more line terminators are required after WEBVTT.

Known-Good

  • tc000-sample_test.test and tc5300-embedded_voice.test are identical
    • remove tc000-sample_test.test since the other fits better with the naming scheme

Bad

  • tc3100-cuetimings_separator_left_bad.test
    • WEBVTT string is malformed so not checking timestamp 
    • changed "WWEBVTT" to "WEBVTT"
  • tc3102-cuetimings_separator_left_bad.test
    • WEBVTT string is malformed so not checking timestamp 
    • changed "WWEBVTT" to "WEBVTT"
  • tc3103-cuetimings_separator_leftright_bad.test
    • WEBVTT string is malformed so not checking timestamp 
    • changed "WWEBVTT" to "WEBVTT"
  • tc3104-cuetimings_separator_malformed_bad.test
    • WEBVTT string is malformed so not checking timestamp 
    • changed "WWEBVTT" to "WEBVTT"
  • tc3105-cuetimings_separator_missing_bad.test
    • WEBVTT string is malformed so not checking timestamp 
    • changed "WWEBVTT" to "WEBVTT"
  • tc3106-cuetimings_separator_right_bad.test
    • WEBVTT string is malformed so not checking timestamp 
    • changed "WWEBVTT" to "WEBVTT"

Known-Bad

  • Nothing wrong

Mozilla and the <track> Element

I want to explore how Mozilla is currently dealing with the <track> element for HTML5.

According to the Mozilla documentation it is currently not supported. According to the Mozilla bug tracker, we at Seneca College are working on it. There is nothing in the documentation for WebVTT.

The <video> tag is supported. Here is the bug tracker for it.

The DOM media element is here in the documentation.

So now I'm looking at the source code. There is a nice post for the directory structure.

This looks like what deals with the media element.
https://github.com/mozilla/mozilla-central/tree/master/dom/media
https://github.com/mozilla/mozilla-central/blob/master/content/html/content/public/nsHTMLMediaElement.h
https://github.com/mozilla/mozilla-central/blob/master/dom/interfaces/html/nsIDOMHTMLMediaElement.idl

Video element:
https://github.com/mozilla/mozilla-central/blob/master/content/html/content/src/nsHTMLVideoElement.cpp

Anyway, it will take a lot of time to figure out where WebVTT and <track> will plug in.

Wednesday, 14 November 2012

HTML5 Text Track Model

The purpose of WebVTT files in HTML is to create a list of Text Track Cues for the HTML5 Text Track object, which is in turn part of a media element. It is the responsibility of the parser to return a list of text track cues to the text track along with the rendering rules for it. That means our WebVTT parser will be called upon to return those two things. There are extensive rules for how the browser should deal with tracks and how they behave. I will not cover them here. This post is for describing what the objects look like.

I wrote before how the HTML5 <track> object is composed. Now I will explain how the text track model is composed in the DOM.


Text Track Model

The media element has a list of text tracks, which can come from three sources. They are added in the following order:
  1. Tracks listed with the <track> tag in the order specified in HTML.
  2. Tracks added dynamically with the addTextTrack() method in the order they are added.
  3. Tracks that are embedded in a media object.
Tracks can be enabled or loaded automatically depending on many settings, but most importantly on user preferences. The rules for track selection and fetching are not relevant in describing the model and they can be found in the specifications.
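
The ordering of the three sources can be pictured with a short Python sketch. Everything here (TrackSource, TrackEntry, text_track_list) is a name I made up for illustration; a browser keeps this list itself and does not expose anything like these types.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List


class TrackSource(IntEnum):
    """Where a text track came from, in the order the list uses."""
    TRACK_ELEMENT = 0    # <track> tags, in HTML order
    ADD_TEXT_TRACK = 1   # addTextTrack() calls, in call order
    IN_BAND = 2          # tracks embedded in the media object


@dataclass
class TrackEntry:
    label: str
    source: TrackSource
    order: int  # position within its own source


def text_track_list(entries: List[TrackEntry]) -> List[TrackEntry]:
    """Sort by source first, then by position within that source."""
    return sorted(entries, key=lambda e: (e.source, e.order))


entries = [
    TrackEntry("embedded captions", TrackSource.IN_BAND, 0),
    TrackEntry("script track", TrackSource.ADD_TEXT_TRACK, 0),
    TrackEntry("French subtitles", TrackSource.TRACK_ELEMENT, 1),
    TrackEntry("English subtitles", TrackSource.TRACK_ELEMENT, 0),
]
print([e.label for e in text_track_list(entries)])
```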

The following are components of a text track.

kind

A string which represents how the text track is handled by the browser. It can change and is set by the <track> tag. If the tag's value is changed, so must this value. It must be one of the following:
  1. subtitles
  2. captions
  3. descriptions
  4. chapters
  5. metadata

label

A string that identifies the track for users. It can change and is set by the <track> tag. If the tag's value is changed, so must this value. If it is empty the browser should generate a label based on other properties such as kind and language.

in-band metadata track dispatch type

If the track is embedded in the media object and its kind is metadata, then the in-band metadata track dispatch type is a string used to get scripts to work with the track; otherwise it is an empty string. The rules for getting this vary with the type of media (see here).

language

A string which indicates the language of the track. It is a BCP 47 language tag. It can change and is set by the <track> tag. If the tag's value is changed, so must this value.

readiness state

This is not actually in this model but in the <track> tag model (see pseudocode at bottom of this post). It is the loading status of the track, and is initially set to NONE for not loaded. It is a number which indicates one of the following states:
  1. NONE
    • Value is 0
    • The track has not obtained any cues.
  2. LOADING
    • Value is 1
    • The track is loading and has not hit any errors. The cues are still loading.
  3. LOADED
    • Value is 2
    • The track has loaded and there were no errors. All cues are loaded.
  4. ERROR
    • Value is 3
    • The track hit one or more errors. Cues may not have been loaded.

mode

The active state of the track. Initially set to disabled. It is one of three values:

  1. disabled
    • Track is not active and is ignored by the browser.
  2. hidden
    • The track is active but the cues are not being rendered.
  3. showing
    • The track is active and the cues are being rendered as appropriate for the track's kind.
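
The readiness state and mode values above map naturally onto enumerations. Here is a minimal Python sketch, assuming the class names ReadyState and TextTrackMode and the two helper functions, all of which are mine and not part of any browser API.

```python
from enum import Enum, IntEnum


class ReadyState(IntEnum):
    """Loading status of a track, using the numeric values listed above."""
    NONE = 0
    LOADING = 1
    LOADED = 2
    ERROR = 3


class TextTrackMode(Enum):
    """Active state of a track."""
    DISABLED = "disabled"
    HIDDEN = "hidden"
    SHOWING = "showing"


def is_active(mode: TextTrackMode) -> bool:
    """Hidden and showing tracks are both active; disabled ones are ignored."""
    return mode is not TextTrackMode.DISABLED


def renders_cues(mode: TextTrackMode) -> bool:
    """Only a showing track has its cues rendered."""
    return mode is TextTrackMode.SHOWING


initial_state = ReadyState.NONE
initial_mode = TextTrackMode.DISABLED
print(int(initial_state), initial_mode.value)
```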

list of cues

A list of text track cues. This list is dynamic since the cues are parsed asynchronously. It is initially empty. It also has the rules for rendering the text track, which for WebVTT are found in the WebVTT specification.


Text Track Cue Model

The WebVTT parser returns a list of text track cues which is added to the text track. A text track cue has the following format.

identifier

An arbitrary string. It is initially an empty string.

start time

The start time of the cue in seconds and fractions of a second.

end time

The end time of the cue in seconds and fractions of a second.

pause-on-exit flag

A true or false value. If true the media element will pause at the end of the current cue. It is initially false.

writing direction

A string representing if the writing is to be displayed horizontally or vertically. It is initially horizontal. There are three possible values:
  1. Horizontal
    • Value is an empty string
    • lines are horizontal
    • consecutive lines are displayed below each other
    • line position is relative to height
    • text position and size are relative to width
  2. Vertical growing left
    • Value is string rl
    • lines are vertical
    • consecutive lines are displayed to the left of each other
    • line position is relative to width
    • text position and size are relative to height
  3. Vertical growing right
    • Value is string lr
    • lines are vertical
    • consecutive lines are displayed to the right of each other
    • line position is relative to width
    • text position and size are relative to height

snap-to-lines flag

A true or false value. If true line position indicates a position like a line of text in a document. If false line position is a percentage. It is initially true.

line position

An integer representing the position where the text is to be displayed. The direction is indicated by writing direction. The snap-to-lines flag indicates it is either a percentage or a position like line number on a document. It can also be set to the string auto which means it is determined based on other active cues. If it is a percentage, then it must be between 0 and 100 (inclusive). It is initially auto.

text position

An integer percentage between 0 and 100 (inclusive) that represents the position where the text is to be displayed. The direction is indicated by writing direction. It is initially 50.

size

An integer percentage between 0 and 100 (inclusive) that represents the width (or height) of the text display area. The direction is indicated by writing direction.

Example 1: Let writing direction be horizontal, then size is the width of the Caption Rendering Box.
Source: http://www.w3.org/community/texttracks/wiki/Caption_Model#5._Caption_Rendering

alignment

A string that indicates how text is aligned within the rendering area (Caption Rendering Box in Example 1). The start and end side depend on the writing direction. It is initially middle. There are five possible values:

  1. start
    • Text is aligned to the start side.
  2. middle
    • Text is centred.
  3. end
    • Text is aligned to the end side.
  4. left
    • Text is aligned to the left.
  5. right
    • Text is aligned to the right.

text

The actual text of the cue. In addition, it is associated with the rules for how it is to be interpreted. The rules for interpretation are the WebVTT parsing rules, WebVTT cue text rendering rules, and WebVTT DOM construction rules.

active flag

A true or false value. It is used to make sure the cue is rendered properly. Its behavior is dynamic and complex. Please see the specifications for more information.

display state

It is used for rendering. Used in conjunction with active flag. Its behavior is dynamic and complex. Please see the specifications for more information.
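
To make the initial values and ranges of the cue model concrete, here is a hedged Python sketch. The class and field names are my own choices, not the DOM names. The default of 100 for size is an assumption, since this post does not state an initial value for it, and the active flag and display state are left out because their behavior is dynamic.

```python
from dataclasses import dataclass
from typing import Union

WRITING_DIRECTIONS = ("", "rl", "lr")  # horizontal, growing left, growing right
ALIGNMENTS = ("start", "middle", "end", "left", "right")


@dataclass
class CueModel:
    start_time: float                        # seconds
    end_time: float                          # seconds
    text: str
    identifier: str = ""
    pause_on_exit: bool = False
    writing_direction: str = ""              # empty string means horizontal
    snap_to_lines: bool = True
    line_position: Union[int, str] = "auto"
    text_position: int = 50                  # percentage
    size: int = 100                          # percentage; default is an assumption
    alignment: str = "middle"

    def __post_init__(self) -> None:
        if self.writing_direction not in WRITING_DIRECTIONS:
            raise ValueError("writing direction must be '', 'rl' or 'lr'")
        if self.alignment not in ALIGNMENTS:
            raise ValueError("unknown alignment")
        for name in ("text_position", "size"):
            if not 0 <= getattr(self, name) <= 100:
                raise ValueError(f"{name} must be between 0 and 100")
        if self.line_position != "auto":
            if not isinstance(self.line_position, int):
                raise ValueError("line position must be an integer or 'auto'")
            # With snap-to-lines off, the line position is a percentage.
            if not self.snap_to_lines and not 0 <= self.line_position <= 100:
                raise ValueError("percentage line position must be 0 to 100")


cue = CueModel(start_time=1.0, end_time=2.5, text="Hello")
print(cue.line_position, cue.text_position, cue.alignment)
```

When snap-to-lines is true the line position is treated as a line number, so the sketch does not range-check it in that case.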

Additional Text Track Cue Information

There are a number of methods that the different objects require but I want to mention only one since it relates to how WebVTT files get parsed.

getCueAsHTML()

This is a method that the text track cue has that returns a document fragment (which is a small document object or piece of the DOM) by converting the cue text by the WebVTT parsing rules  and WebVTT DOM construction rules.


Pseudocode

W3C HTML5 Specification [1] [2] [3] [4]

interface HTMLTrackElement : HTMLElement {
           attribute DOMString kind;
           attribute DOMString src;
           attribute DOMString srclang;
           attribute DOMString label;
           attribute boolean default;

  const unsigned short NONE = 0;
  const unsigned short LOADING = 1;
  const unsigned short LOADED = 2;
  const unsigned short ERROR = 3;
  readonly attribute unsigned short readyState;

  readonly attribute TextTrack track;
};

enum TextTrackMode { "disabled", "hidden", "showing" };
interface TextTrack : EventTarget {
  readonly attribute DOMString kind;
  readonly attribute DOMString label;
  readonly attribute DOMString language;
  readonly attribute DOMString inBandMetadataTrackDispatchType;

           attribute TextTrackMode mode;

  readonly attribute TextTrackCueList? cues;
  readonly attribute TextTrackCueList? activeCues;

  void addCue(TextTrackCue cue);
  void removeCue(TextTrackCue cue);

           attribute EventHandler oncuechange;
};

// Represents a dynamically updating list
interface TextTrackCueList {
  readonly attribute unsigned long length;   // Number of cues
  getter TextTrackCue (unsigned long index);
  TextTrackCue? getCueById(DOMString id);    // By identifier
};

enum AutoKeyword { "auto" };
[Constructor(double startTime, double endTime, DOMString text)]
interface TextTrackCue : EventTarget {
  readonly attribute TextTrack? track;

           attribute DOMString id;              // Identifier
           attribute double startTime;
           attribute double endTime;
           attribute boolean pauseOnExit;
           attribute DOMString vertical;
           attribute boolean snapToLines;
           attribute (long or AutoKeyword) line;
           attribute long position;
           attribute long size;
           attribute DOMString align;
           attribute DOMString text;
  DocumentFragment getCueAsHTML();

           attribute EventHandler onenter;
           attribute EventHandler onexit;
};