Transition Parsing
The TransitionParsing
submodule provides implementations of transition-based parsing algorighms.
julia> using DependencyTrees.TransitionParsing
In transition-based dependency parsing, trees are built incrementally and greedily, one relation at a time. Transition systems define an intermediate parser state (or configuration), and oracles map confifurations to "gold" transitions.
The DependencyTrees.TransitionParsing
module implements the following transition systems:
Arc-Standard
DependencyTrees.TransitionParsing.ArcStandard
— TypeArcStandard()
Transition system for for Arc-Standard dependency parsing.
Transitions
Transition | Definition |
---|---|
LeftArc(l) | (σ|s1|s0, β, A) → (σ|s0, β, A ∪ (s0, l, s1)) |
RightArc(l) | (σ|s, b|β, A) → (σ, b|β, A ∪ (b, l, s)) |
Shift | (σ, b|β, A) → (σ|b, β, A) |
Preconditions
Transition | Condition |
---|---|
LeftArc(l) | ¬[s1 = 0], ¬∃k∃l'[(k, l', s1) ϵ A] |
RightArc(l) | ¬∃k∃l'[(k, l', s0) ϵ A] |
See Nivre 2004.
Oracle function for arc-standard parsing:
DependencyTrees.TransitionParsing.static_oracle
— Functionstatic_oracle(cfg, gold_tree)
Static oracle function for arc-standard dependency parsing.
Arc-Eager
DependencyTrees.TransitionParsing.ArcEager
— TypeArcEager()
Arc-Eager transition system for dependency parsing.
Transitions
Transition | Definition |
---|---|
LeftArc(l) | (σ|s, b|β, A) → (σ, b|β, A ∪ (b, l, s)) |
RightArc(l) | (σ|s, b|β, A) → (σ, b|β, A ∪ (b, l, s)) |
Reduce | (σ|s, β, A) → (σ, β, A) |
Shift | (σ, b|β, A) → (σ|b, β, A) |
Preconditions
Transition | Condition |
---|---|
LeftArc(l) | ¬[s = 0], ¬∃k∃l'[(k, l', i) ϵ A] |
RightArc(l) | ¬∃k∃l'[(k, l', j) ϵ A] |
Reduce | ∃k∃l[(k, l, i) ϵ A] |
References
Oracle functions for arc-eager parsing:
DependencyTrees.TransitionParsing.static_oracle
— Functionstatic_oracle(cfg::ArcEagerConfig, gold, arc=untyped)
Default static oracle function for arc-eager dependency parsing.
See Goldberg & Nivre 2012. (Also called Arc-Eager-Reduce in Qi & Manning 2017).
DependencyTrees.TransitionParsing.static_oracle_prefer_shift
— Functionstatic_oracle_prefer_shift(cfg::ArcEagerConfig, tree, arc=untyped)
Static oracle for arc-eager dependency parsing. Similar to the "regular" static oracle, but always Shift when ambiguity is present.
See Qi & Manning 2017.
DependencyTrees.TransitionParsing.dynamic_oracle
— Functiondynamic_oracle(cfg::ArgEagerConfig, tree, arc=untyped)
Dynamic oracle function for arc-eager parsing.
For details, see Goldberg & Nivre 2012.
Arc-Hybrid
DependencyTrees.TransitionParsing.ArcHybrid
— TypeArcHybrid()
Arc-Hybrid system for transition dependency parsing.
Described in Kuhlmann et al, 2011, Goldberg & Nivre, 2013.
Oracle functions for arc-hybrid parsing:
DependencyTrees.TransitionParsing.static_oracle
— Functionstatic_oracle(cfg::ArcHybridConfig, tree, arc=untyped)
Static oracle for arc-hybrid dependency parsing.
Return a gold transition (one of LeftArc, RightArc, or Shift) for parser configuration cfg
.
See Kuhlmann et al, 2011?
DependencyTrees.TransitionParsing.dynamic_oracle
— Methoddynamic_oracle(cfg::ArgHybridConfig, tree, arc)
Dynamic oracle function for arc-hybrid parsing.
For details, see Goldberg & Nivre, 2013.
Arc-Swift
DependencyTrees.TransitionParsing.ArcSwift
— TypeOracle function for arc-swift parsing:
DependencyTrees.TransitionParsing.static_oracle
— Functionstatic_oracle(cfg::ArcSwiftConfig, tree, arc)
Oracle function for arc-swift dependency parsing.
Described in Qi & Manning 2017.
List-Based Non-Projective
DependencyTrees.TransitionParsing.ListBasedNonProjective
— TypeListBasedNonProjective()
Transition system for list-based non-projective dependency parsing.
Described in Nivre 2008, "Algorithms for Deterministic Incremental Dependency Parsing."
Oracle function for list-based non-projective parsing:
DependencyTrees.TransitionParsing.static_oracle
— Functionstatic_oracle(::ListBasedNonProjectiveConfig, tree)
Return a training oracle function which returns gold transition operations from a parser configuration with reference to graph
.
<!– ## Misc. –>
<!– For stack-and-buffer transition systems (Arc-Eager, Arc-Standard, Arc-Hybrid, and Arc-Swift), DependencyTrees.jl implements functions for getting the tokens from the stack and buffer in a safe way: –>
<!– @docs --> <!-- stacktoken --> <!-- buffertoken --> <!--
–>
Oracles
An Oracle
maps a parser configuration to one more gold transitions, which can be used to train a dependency parser.
DependencyTrees.TransitionParsing.Oracle
— TypeOracle(system, oracle_function; label=untyped)
Create an oracle for predicting gold transitions in dependency parsing.
system
is a transition system, defining configurations and valid transitions.
oracle_function
is called on a paraser configuration and tree for gold predictions:
oracle(cfg, tree, label)
label
is a function that's called on the gold tokens for that parameters of arcs.
An oracle acts like a function when called on a DependencyTree
, returning either an OracleSequence
or an UnparsableTree
in the case when a tree cannot be parsed.
DependencyTrees.TransitionParsing.OracleSequence
— TypeOracleSequence(oracle, tree, policy=NeverExplore())
A "gold" sequence of parser configurations and transitions to build tree
.
The sequence of transitions is performed according to
policy
is a function that determines whether or not "incorrect" transitions are explored. It will be called like so: `policy(
DependencyTrees.TransitionParsing.UnparsableTree
— TypeUnparsableTree
A dependency tree that an oracle cannot parse.
An Oracle
's third argument is a function called on a DependencyTrees.Token
to parametrize a transition. This parameterization can be arbitrary, but DependencyTrees.TransitionParsing
exports two function which yield labeled or unlabeled dependencies, respectively:
DependencyTrees.TransitionParsing.untyped
— Functionuntyped(token)
Create an arc without a dependency label.
DependencyTrees.TransitionParsing.typed
— Functiontyped(token)
Create an arc with a labeled dependency relation.
Exploration Policies
Training a parser on only optimal sequeces of transitions can result in poor decisions under suboptimal conditions (i.e., after a mistake has been made). To compensate for this, OracleSequence
s can be created with exploration policies to control when (if ever) a "wrong" transition that prevents the correct parse from being produced is allowed.
Exploration policies can wrap a model
, which will be called to predict a transition (called like model(cfg, A, G)
where cfg
is the parser configuration, A
is a vector of possible transitions, and G
is the gold
transition(s) according to the oracle function). The exploration policy then selects the next transition from A
and G
and the prediction, if available.
DependencyTrees.TransitionParsing.AlwaysExplore
— TypeAlwaysExplore()
Policy for always exploring sub-optimal transitions.
If model
predicts a legal transition, apply it. Otherwise, sample from the possible transitions (without regard to the oracle transitions) according to rng
.
DependencyTrees.TransitionParsing.NeverExplore
— TypeNeverExplore()
Policy for never exploring sub-optimal transitions.
If model
predicts a gold transition, apply it. Otherwise, choose from the gold transitions according to rng
.
DependencyTrees.TransitionParsing.ExplorationPolicy
— TypeExplorationPolicy(k, p)
Simple exploration policy from Goldberg & Nivre, 2012. Explores at rate p
.
With rate p
, follow model
's prediction if legal, or choose from the possible transitions according to rng
if the prediction can't be followed. With probability 1 -p
, choose from gold transitions according to rng
.