Graph Parsing
The GraphParsing submodule provides implementations of graph-based parsing algorithms.
julia> using DependencyTrees.GraphParsingIn graph-based dependency parsing, trees are built all at once, predicting a globally best parse from a set of possible ones. A set of possible parses for a sentence is represented by a square matrix of arc weights: a DependencyGraph.
An appealing feature of graph parsing is that jointly predicting the whole tree structure avoids the parser making a "wrong turn" the way that transition parsers can. The drawback of this is the sacrifice of speed: graph parsing methods are slower than transition parsing methods (𝒪(n^2) for the Chu-Liu/Edmonds algorithm and 𝒪(n^3) for the Eisner algorithm, where n is the number of tokens in a sentence).
The DependencyTrees.GraphParsing module implements the following algorithms:
- Chu-Liu/Edmonds algorithm for decoding a maximum spanning tree (MST)
- Eisner's algorithm for decoding a projective tree
Dependency Graphs
DependencyTrees.GraphParsing.DependencyGraph — Type
DependencyGraphA square matrix of arc weights, reprenting a (set of) dependency parse trees.
Root arc scores are kept along the diagonal.
DependencyTrees.GraphParsing.DependencyGraph — Method
DependencyGraph(tree::DependencyTree)Create a dependency graph from tree.
Arcs will have a score of 1.0, and non-arcs 0.0.
julia> using DependencyTrees, DependencyTrees.GraphParsing
julia> tree = DependencyTree([(0, "book"), (3, "that"), (1, "flight")])
┌──────── 0 ROOT
└─►┌───── 1 book
│ ┌─► 2 that
└─►└── 3 flight
julia> graph = DependencyGraph(tree)
3×3 DependencyGraph{Float64}:
1.0 0.0 1.0
0.0 0.0 0.0
0.0 1.0 0.0
Graph Decoding Algorithms
Chu-Liu/Edmonds Algorithm
DependencyTrees.GraphParsing.chu_liu_edmonds — Function
chu_liu_edmonds(G)Decode a dependency tree with the Chu-Liu/Edmonds algorithm, returning a (tree::DependencyTree, score) tuple.
Uses a recursive algorithm to predict a (possibly non-projective) maximum spanning tree for the graph G.
Further Reading
Eisner's Algorithm
DependencyTrees.GraphParsing.eisner — Function
eisner(G)Decode a projective dependency tree using the Eisner algorithm, returning a `(tree::DepependencyTree, score) tuple.
Uses a bottom-up chart parsing algorithm to decode the best possible projective tree from the arc scores in G.
Further reading
Cycle Detection
Tarjan's Algorithm
DependencyTrees.GraphParsing.tarjan — Function
tarjan(tree)Find strongly-connected components using Tarjan's algorithm.