Timber allows the (advanced) user to execute queries by entering the physical plans directly. In this way, a query plan can be optimized to execute more efficiently than a naively generated plan. Some query
functionalities that are not currently supported at Timber's XQuery level
are suppported at the physical plan level.
The NRE is a unique number that gets assigned to a collection of nodes upon
retrieval from indices or database. In some iterators, where nodes are being
created (e.g. construct), NRE gets assigned to the newly created nodes. Each
node in the witness tree will have an NRE. This NRE is used to reference the
node.
Iterator
|
Calling Format
|
Parameter Explanation
|
Index
Access
IndexAccess.h
GistIndexAccess.h
|
I,nre,indexName,filename,indexSource,indexType,value1,value2
|
NRE:
the nre to be assigned to nodes retrieved from the index.
IndexName: name of the index.
Filename: name of the file that
the index is built on.
IndexSource: GIST: if the index is a
gist index.
HASH: if index is a
hash index.
IndexType: INT: the index key is an
integer.
FLT: the index key is a
float.
STR: the index key is a
string.
DBL: the index key is a
double.
Value1: the key value you are
asking for.
Value2: if you are asking for a
range, this is the second value. This is optional.
|
Scan
Access
ScanIterator.h
SelectionCondition.h
|
S,nre,fileName,rootKey,nodeType,returnType,num1,num2,Left,opr,rightType,right,
…..,num3,start,end,…
|
NRE:
the nre to be assigned to
nodes retrieved from the database.
Filename:
name of the file to scan.
RootKey:
start key of the root element
in the file.
NodeType:
type of the node to scan for.
ALL_NODES: scan all nodes.
DOCUMENT_NODE: scan doc nodes.
ELEMENT_NODE: scan for element nodes.
ATTRIBUTE_NODE: scan for attribute nodes.
TEXT_NODE: scan for text nodes.
COMMENT_NODE: scan for comment nodes.
ReturnType:
type of the node to be returned as part of the result of
the scan.
THISNODE: return the node scanned.
PARENTNODE: return the parent of the scanned
node.
ELEMENTNODE: return the element node that is
closest to the scanned node (if attribute-node, get its
parent)
ATTRIBUTENODE: return the attribute node that is
closest to the scanned node (if elem-node,
get its attrs)
SUBTREE: return the subtree
of the scanned node.
Num1:
the number of disjunctive conditions you want to
apply on scanned nodes. Num2,Left,opr,rightType,right
should occur num1number of times.
Num2:
the number of conjunctive conditions in each disjunct.
Left,opr,rightType,right
should occur num2 number of times.
Left:
one of the following values: (check SelectionCondition.h
for more details)
VALUE,LENGTH,XMLFILENAME,DTDFILENAME,NODETAG,
CHILDNUMBER,
ATTRIBUTENUMBER, HAS_CHILD,
HAS_ATTRIBUTE,ATTRIBUTE_VALUE,ELEMENTCONTENT
If
you choose ATTRIBUTE_VALUE then pass here an attribute
name.
Opr:
one of the following values: EQ,LE,LT,GT,GE,NE: for
=,<,<=,>,>=,!=
CONTAIN,CONTAINEDBY,STARTWITH
RightType:
STR/INT/FLT
Right:
value you want to be checked. if rightType is STR,
you can pass <EMPTY> to conmpare against empty
content. otherwise, pass a value.
Num3:
number of scan ranges: start,end
should occur num3 number of times.
The scan will scan the sub-trees contained in these
start-end.
Start:
the
start key of the first node to scan.
End:
the
end key of the last node to scan.
|
Structural
Join
StackBasedAncs.h
StackBasedDesc.h
|
J,Algorithm,ancsNRE,descNRE,leftSiblingNRE
,
RightSiblingNRE,expectedDepth,relation,projectWhich,N
|
Alg:
A: if you want results to be sorted by ancs.
D: if you want results to be sorted by desc.
N: if you want a negated join (output ancs
that don’t have desc).
a: if you want an ancs-outer
join sorted by ancs (ancs
is output even if it doesn’t join).
AncsNRE:
the nre of the ancs
in the ancs input witness
tree.
DescNRE:
the nre of the desc
in the desc input witness
tree.
LeftSiblingNRE:
the nre of the left sibling
in ordered semantics. If you want unordered semantics,
Pass –1.
RightSiblingNRE:
the nre of the right
sibling in ordered semantics. If you want unordered
semantics, Pass –1.
ExpectedDepth:
the expected depth of the ancs.
I.e. how many ancs are
likely to be nested under other ancs
. If in doubt, pass
the depth of the XML document.
Relation:
A: ancs-desc relationship
between two inputs to the join.
P: parent-child relationship. The ancs
should be one level higher than desc.
ProjectWhich:
what you want to keep in the output of this join.
B: both ancs and desc
are kept.
A: only ancs is kept.
Valid when alg = A.
D: only desc is kept.
Valid when alg = D.
N:
this field is optional. If you pass an “N”, then
output of the join will be nested descendants. If you
don’t pass it or pass something else, output of join
will be ancs-desc pairs.
Valid only with stackBased
ancestor join.
|
Filter
FilterIterator.h
FilterCondition.h
|
f,num1,num2,app,lType,lNRE,lNum,lStr,opr,rType,rNRE,
rNum,
rStr
,IndexName,FileName…
|
Num1:
is the number of disjunctive conditions. Num2,lType,
….. should occur num1
number of times.
Num2:
number of conjunctive conditions in each disjunctive
condition. App,lType,lIndex,..
Should
occur num2 number of times.
App:
is one of the following
E: the tree will pass if EVERY node with nre
lNRE matches the condition.
AO: the tree will pass if AT LEAST ONE node with nre
lNRE matches the condition.
EO: the tree will pass if EXACTLY ONE node with nre
lNRE matches the condition.
EX: the tree will pass if EXACTLY ONE node has
the nre specified and it
matches the condition.
Note
that it is applied to the left part of the condition
only.
lType
:
type of the left value of the predicate.
V: left side is a value.
A: left side is an attr.
T: left side is text directly under node. (child
text)
DT: left side is text
anywhere under node. (desc
text)
L: left side is length.
LCV,LCA,LCT,LCDT,LCL:
value/attr/text/descText/length
of local child.
LAV,LAA,LAT,LADT,LAL:
value/attr/text/descText/length
of local ancs.
ACV,ACA,ACT,ACDT,ACL:
value/attr/text/descText/length
of actual child.
AAV,AAA,AAT,AADT,AAL:
value/attr/text/descText/length
of actual ancs.
SK: left side is a start key.
EK: left side is an end key.
LEV: left side is a level.
C: left side is a const.
LF: left side is local fan-out.
AF: left side is actual fan-out.
LD: left side is local sub-tree depth.
AD: left side is actual sub-tree depth.
CI: left side is a constant that we are getting
from an Iterator.
lNRE
:
nre of the left node in the
left input witness tree. This node we will be using its
value, attr, text… etc.
depending on lType.
lNum:
is the child number if lType
is one of the following:
LCV,LCA,LCT,LCDT,LCL,ACV,ACA,ACT,ACDT,ACL.
Or
it is how many levels up you would like to go to get an
ancestor if lType is one of
the following:
LAV,LAA,LAT,LADT,LAL,
AAV,AAA,AAT,AADT,AAL.
If
lType = C and opr
uses number comparison then lNum
holds the number to be compared against.
lStr:
is attrName if lType
is one of the following:
A,
LCA,LAA,ACA,AAA. Or if lType
is V and lNRE returns an
attribute node. Or if lType
is C and operation uses string comparison, then this
will hold the constant value to be compared against. if
you want to compare against an empty string, pass
<EMPTY> here.
opr:EQN,NEN,LTN,LEN,GTN,GEN:=,!=,<,<=,>,>=
using number comparison.
EQS,NES,LTS,LES,GTS,GES:=,!=,<,<=,>,>=
using string comparison.
S: left start with right. Strings.
B: left is at the beginning of right. Strings.
C: left contains right.
CD: left contained in right.
NS: left does not start with right.
NC: left does not contain right.
rType,
rNRE,rNum
,…
are exactly like the above except they are for the right
side of the condition.
IndexName:
if you want to use a join index to evaluate filter, pass
the name of the index here. otherwise, pass NULL.
FileName:
if you pass an indexName, you have to pass the xml file
name here. otherwise, pass NULL.
|
Sort
SortIterator.h
ExternalSortIterator.h
|
R,exp.Size,sortByWhat,num,nre,order
,whereEmptyGoes,…,X
,order,X
|
ExpectedSize:
is the number of trees we are getting as input.
SortByWhat:is
K:
to sort by start key
S: to sort by witness tree score.
T: to sort by the whole tree (start keys).
If
sorting by start key, give the following
parameters:
Num:
number of nodes you want to sort with. nre,order.whereEmptyGoes
should occur num number of times.
nre
:
nre of node in input witness
tree you want to be sorted with.
Order:
A: ascending order.
D: descending order.
WhereEmptyGoes:
where the
empty nodes go. So if you provide an nre and it returns no nodes,
this tree will be either put at the beginning or the end of the set
of trees to be sorted. It takes one of two values:
B: the empty ones will go at the beginning.
E: the empty ones will be placed at the end.
If
sorting by score or tree, give the following parameter:
Order:
A: ascending order.
D: descending order.
X:
this field is optional. If you pass an “X”, external
sort will be used. If you don’t pass anything or pass
something else, in-memory quick sort will be used.
|
Value
Sort
ValueSortIterator.h
ExternalValueeSortIterator.h
|
r,exp.Size,num,sortBy,NRE,order,attrName
,whereEmptyGoes…,X
|
ExpectedSize:
is the number of trees we are getting as input.
Num:
number of nodes you want to sort with. sortBy
, nre,order,attrName,whereEmptyGoes
should occur num number of times.
SortBy:
AN: sort by attribute value. Use number comparison.
AS: sort by attribute value. Use string
comparison.
TN: sort by text. Use number comparison.
TS:
sort by text. Use string comparison.
TG: sort by tag name.
VN: sort by value. Value is text if nre
is for a text
node
.
It is tagname if nre
is for an element node. It is attribute value if the nre
belongs to an attribute node. In the last case attrName
should be provided. Use number comparison.
VS: same as VN except use string comparison.
SK: sort by start key.
NRE:
NRE of element that you want to sort by its text or attr.
Order:
A: ascending.
D: descending.
AttrName:
name of attribute to be sorted by in the case of AN, AS,
NV and VS. Pass NULL if
You want to sort by TN or TS.
WhereEmptyGoes:
where the empty nodes go.
So if you provide an nre
and it returns no nodes or you give an attribute name
that is not in there, this tree will be either put at
the beginning or the end of the set of trees to be
sorted. It takes one of two values:
B: the empty ones will go at the beginning.
E: the empty ones will be placed at the end.
X: this
field is optional. If you pass an “X”, external sort will be used. If
you don’t pass anything or pass something else, in-memory quick sort
will be used. |
Set
Ops.
UnionIterator.h
–
IntersectionIterator.h – DifferenceIterator.h
|
T,whichOp,length
|
WhichOp: U: union
I: intersection.
D: difference.
Length: starting from root, how many
nodes you want to compare. Optional. If not specified, deep equality is used.
|
Group
by
GroupByIterator.h
|
G,exp.Sz,groupByWhat,GroupByNRE,GroupByAttrName,keepTrees,Num,opr,
oprOnWhat,attrName,NRE,opNRE
….rootNRE,valueNRE,sort
|
ExpectedSize:
is the number of trees we are getting as input.
GroupByWhat:
AN: attribute. Use number comparison.
AS: attribute. Use string comparison.
TN: text. Use number comparison.
TS: text. Use string comparison.
SK: start key.
VN: value. if groupBy NRE is an element, then we group
by tag. if groupBy NRE is an attr, then we group by
attribute value of attribute groupByAttrName. if groupBy
NRE is text, then we group by value of text. Use number
comparison.
VS: value. if groupBy NRE is an element, then we group
by tag. if groupBy NRE is an attr, then we group by
attribute value of attribute groupByAttrName. if groupBy
NRE is text, then we group by value of text. Use string
comparison.
GroupByNRE:
the NRE of the element you want to group by its attr,
text or start key.
groupbyAttrName
:
is the attribute name of the node if you are grouping by
attribute value. Otherwise, pass NULL.
KeepTrees:
0: do not keep the input trees. Just pass the grouped by
value.
1: keep the input trees and pass the grouped by
value.
Num:
number of operations to be performed on the grouped
trees. Opr, oprOnWhat,
attrName,nre,opNRE
should occur num number of times.
Opr:
Operation to be performed on the grouped trees.
C: count
A: avg
S: sum
M: max
m: min
N: none
OprOnWhat:
On what you want to perform the operation.
A: attribute. Number comparison.
LA: local attr. not from DB. number omparison.
T: text. Number comparison.
LT: local text. not from DB. Number
comparison.
LF:
local fanout.
AF: actual fanout.
LD: local depth.
AD: actual depth.
V: value. if NRE is an element, then we perform
operation on tag. if NRE is an attr, then we perform
operation onattribute value of attribute AttrName. if
NRE is text, then we perform operation on value
of text. Use number comparison.
AttrName:
name of attribute if AN or
AS. NULL otherwise.
NRE:
NRE of node in input trees that you want to perform
operation on its text, attr,
fanout or depth.
opNRE
:
the nre to be assigned to
the node holding the operation result value.
rootNRE
:
the NRE to be assigned to the dummy root added to the
group by tree.
valueNRE
:
the
NRE to be assigned to the dummy node holding the group
by value (it is the first child of the root).
Sort:
0/1. if you want the groupby
to sort by grouped by value, pass 1. if
results are already sorted by grouped by value, pass 0.
|
Function
FunctionIterator.h
|
F,num,opr,oprOnWhat,attrName,NRE,opNRE
….treeLevel
|
Num:
number of operations to be performed on input trees. Opr,
oprOnWhat, attrName,nre,opNRE
should occur num number of times.
Opr:
C: count
A: avg
S: sum
M: max
m: min
OprOnWhat:
A: attribute. Number comparison.
LA: local attr. not from DB. number comparison.
T: text. Number comparison.
LT:
local text. not from DB. Number comparison.
LF: local fanout.
AF: actual fanout.
LD: local depth.
AD:
actual depth.
V: value. if NRE is an element, then we perform
operation on tag. if NRE is an attr, then we perform
operation onattribute value of attribute AttrName. if
NRE is text, then we perform operation on value
of text. Use number comparison.
AttrName:
name of attribute if AN or
AS. NULL otherwise.
NRE:
NRE of node in input trees that you want to perform
operation on its text, attr,
fanout or depth.
opNRE
:
the nre to be assigned to
the node holding the operation result value.
treeLevel
:
if you want the operations to be performed on the tree
level pass 1. on the input
level pass 0. The difference is that in the case of the
input level, the operations are performed across all
input trees and one output tree is generated holding the
operation result values. In the tree level case, the
operations are performed on every tree and the result
nodes are appended to the result.
|
Value
join
ValueJoinIterator.h
|
j,rootNRE,leftNRE,leftTag,rightNRE,rightTag,expSize,attrNameLeft,attrNameRight,
opr,joinWhatLeft,joinWhatRight,sortedInput,N,O,A
|
RootNRE:
the nre to be assigned to
the dummy root added to the joined tree.
LeftNRE:
NRE of the node in the left input trees that will be
checked for join.
LeftTag:
the tag of the element in the left
input tree to perform value join on. If you know
its NRE, pass NULL here and pass the NRE in leftNRE.
If leftTag is not NULL, leftNRE
is ignored.
RightNRE:
NRE of the node in the right input trees that will be
checked for join.
RightTag:
the tag of the element in the right
input tree to perform value join on. If you know
its NRE, pass NULL here and pass the NRE in rightNRE.
If rightTag is not NULL, rightNRE
is ignored.
Exp.
Size:
expected number of trees from the right input (join will
block on right input).
AttrNameLeft:
if joining made based on an attr
value, then put attr Name of
the left input here. NULL otherwise.
AttrNameRight:
if joining made based on an attr
value, then put attr Name of
the right input here. NULL otherwise.
Operator:
EQN:
equality join. Use number
comparison
NEN: non-equality join.
Use number comparison
LTN: if left is < right, join. Use number
comparison
LEN: if left is <= right, join. Use number
comparison
GTN: if left is > right, join. Use number
comparison
GEN: if left is >= right, join. Use number
comparison
EQS: equality join.
Use string comparison
NES: non-equality join.
Use string comparison
LTS: if left is < right, join. Use string
comparison
LES: if left is <= right, join. Use string
comparison
GTS: if left is > right, join. Use string
comparison
GES: if left is >= right, join. Use string
comparison
C: if left CONTAINS right. String comparison.
CB: if left CONTAINED BY right. String
comparison.
JoinWhatLeft:
A: join by attr.
T: join by text.
DT:
join by text not directly underneath node. but even desc
text.
S: join by start key.
E: join be end key.
L: join by level.
N: join by nothing. Cartesian product.
TR: join by subtree rooted at leftNRE. Tags, content,
and attributes are to be compared.
TNR: join by subtree rooted at leftNRE. Tags, content, and attributes
are to be compared except the root tag.
V: value of the node leftNRE
if it is an element node, then tag name is used
. if it is an attr
node, then attribute attrNameLeft
is used. If it is a text node, then text is used.
JoinWhatRight:
A: join by attr.
T: join by text..
DT: join by text not directly underneath node. but even
desc text.
S: join by start key.
E: join by end key.
L: join by level.
N: join by nothing. Cartesian product.
TR: join by subtree rooted at rightNRE. Tags, content,
and attributes are to be compared.
TNR: join by subtree rooted at rightNRE. Tags, content,
and attributes are to be compared except the root tag.
V: value of the node rightNRE
if it is an element node, then tag name is used
. if it is an attr
node, then attribute attrNameRight
is used. If it is a text node, then text is used.
SortedInput:
0: if the input is not sorted by the join value.
(nested-loops join is performed)
1: if the input is already sorted by join value.
(sort-merge join is performed)
IndexName:
if you want to use a join index to evaluate this join, pass the name
of the index here. otherwise, pass NULL.
FileName: if you
pass an indexName, you have to pass the xml file name here. otherwise,
pass NULL.
N:
this parameter is optional. If you pass N, then the output of the join
will be nested by the left input. I.e. each left input tree will have
all joining right input trees with it in the output tree.
O:
this parameter is optional. If you pass O, then the join
will be left outer join. I.e. each left input tree will
be output whether or not it joined with a right input
tree..
A:
this parameter is optional. if you pass it, at least one of all the
nodes with NRE leftNRE will need to match at least one of the nodes
with NRE rightNRE. This works only with nested-loops join. |
Index Hash Nested Value Join
IndexHashValueJoinIterator.h
|
H,expSize,indexName,filename,leftNRE,rightNRE,rootNRE,O,A
|
Exp. Size:
expected number of trees from the right input (join will
block on right input).
IndexName:The
name of the value join index. it is the index to which
we will be submitting start keys of node with leftNRE in
input trees and get right nodes
FileName:
The name of the XML file where right trees come from.
LeftNRE:
NRE of the node in the left input trees that will be
checked for join.
RightNRE:
NRE of the node in the right input trees that will
be checked for join.
RootNRE:
the nre to be assigned to
the dummy root added to the joined tree.
O:
this parameter is optional. If you pass O, then the join
will be left outer join. I.e. each left input tree will
be output whether or not it joined with a right input
tree..
A:
this parameter is optional. if you pass it, at least one
of all the nodes with NRE leftNRE will need to match at
least one of the nodes with NRE rightNRE
|
One Sided Value Join
OneSidedValueJoinIterator.h
|
O,indexName,filename,leftNRE,assignedRightNRE,rootNRE,N,O
|
IndexName:The
name of the value join index. it is the index to which
we will be submitting start keys of node with leftNRE in
input trees and get right nodes
FileName:
The name of the XML file where right trees come
from.
LeftNRE:
NRE of the node in the left input trees that will be
checked for join.
assignedRightNRE:
NRE to be assigned to the nodes read from the index.
RootNRE:
the nre to be assigned to
the dummy root added to the joined tree.
N:
this parameter is optional. If you pass N, then the output of the join
will be nested by the left input. I.e. each left input tree will have
all joining right input trees with it in the output tree.
O:
this parameter is optional. If you pass O, then the join
will be left outer join. I.e. each left input tree will
be output whether or not it joined with a right input
tree.
|
Child
Chooser
ChildChooserIterator.h
|
c,whichChild,num,parentNRE...
|
WhichChild:
which child you want to pass. 1: first child. 2:second
child,…. –1:last child. So this Iterator expects
input to have ancs-desc
pairs sorted by ancs. And it
will pass for each ancs, the
i th occurrence.
num:
number of parents you want to pass their children.
parentNRE should occur num number of times.
ParentNRE:
the NRE of the ancs in the
input tree.
What
this iterator does is for a group of ParentNREs, it will
pass the whichChild occurance in the input stream. input
should be sorted on parentNREs key.
|
Duplicate
Eliminator
DuplicateEliminatorIterator.h
|
D,byWhat,NRE,exp.Size,attrName,sort
|
ByWhat:
SK:
eliminate trees with duplicate start key (id) of a
specific node.
TR: eliminate duplicate trees based on start
keys. (whole tree is checked)
TS: eliminate trees with duplicate text values.
String comparison.
TN: eliminate trees with duplicate text values.
Number comparison.
AS: eliminate trees with duplicate attr
values. String comparison.
AN: eliminate trees with duplicate attr
values. Number comparison.
NRE:
nre of the node to eliminate
duplicates by. If eliminating duplicates by tree, this
is ignored. If by start key, then the start key of this nre
is used in eliminating duplicates. If by text, the text
of the node with this nre is
used. If by attr, then the
attribute of this node is used.
ExpectedSize:
is the number of trees we are getting as input. Pass
whatever if sorting is turned off.
AttrName:
if byWhat =AS or AN, then
this is the name of the attribute that you want to
eliminate duplicates of.
Sort:
0: if you do not want it to sort input. (it
is already sorted).
1: if you want the Iterator to sort input before
eliminating duplicates.
X: this field is optional. If you
pass an “X” and sort=1, external sort will be used.
If you don’t pass anything or pass something else and
sort=1, in-memory quick sort will be used.
|
Phrase
Finder
PhraseFinderIterator.h
|
n,NRE,indexName,filename,phrase
|
NRE:
the nre to be assigned to the output nodes.
IndexName: is the name of the index
to look up when searching for the words in the sentence. It should be some
kind of an inverted index or so.
Filename:
name of the file the index is built on.
Phrase: this is the phrase you
want to find in the document. You can have “*” anywhere in the phrase to
represent wildcards. For example, phrase: “hello ** world”
Will
match “hello bad ugly world”, “hello blah blah world” but not “hello world”
or “hello blah world”. Also, if you want to make sure that hello is the
second word in the text, you should do “* hello world” third :”** hello
world”.
|
Construct
Iterator
ConstructIterator.h
|
s,n1,lev,type,src,NRE,str1
str1,str2
str1,rootNRE,n2,ns,lev,rel,getWhat,attrName….
rootNRE,n2,ns,lev,rel,getWhat,attrName….
rootNRE,0,rootGetWhat,attrName
|
n1:is
the number of nodes in the output tree (attribute is a
node, element is a node and content is a node). Lev,type,src,…
are repeated n1 number of times.
Lev:
is the level of the node in the output tree (attr
is 1 level deeper than its element)
Type:
is the type of the node in the output tree:
E: element node.
A: attribute node.
C: content node.
OA: optional attribute. just like attribute except that
if it doesn't have a value, it will not apper in teh
output.
Src:
source of data for this output node.
C: constant. I.e. data to be placed there should
be specified now.
R: reference. I.e. a pattern tree is used with a
reference root (exp. $a).
NRE:
is the nre to be assigned to
extra nodes. If there are nodes to be retrieved from the
database, then this nre will
be assigned to them in case type = C. if type = E or A,
then this nre is assigned to
the new node formed whether it is an element or attr
node.
Now,
there are 4 different information formats you need to
provide.
1-
If src is constant.
Str1:
should be the constant that will be the element name if
the type is element or
Text if the type is content.
Str1,str2:
str1 should be the attribute name and sr2 is attribute
value if type is attribute node.
2-
If src is reference.
str1,rootNRE,n2,ns,lev,rel,getWhat,attrName….
If type is attribute.
str1:
is the name of the attribute
RootNRE:
is the nre of the root of
the pattern tree node in the input witness trees.
n2:
is the number of nodes in the pattern tree. Level,
relation, getWhat should be
repeated num-1 number of times. Level,relation,getWhat
will describe nodes of the pattern tree.
The nodes will be listed in a depth-first
pre-order manner.
Ns:
node source: if I, then index accesses are used to
retrieve nodes. If W, then the witness tree is scanned
for nodes. . If ns = I then you should pass in the
following lines a bunch of index access. If ns=W, then
after lev an extra field is
required, tagName.
Level:
the level of the node in the pattern tree.
tagName
:
pass this field only if ns = W. otherwise, don’t pass
anything here.
Relation:
the relationship with its parent in the pattern either
A: ancs-desc or P:
parent-child.
GetWhat:
is
what you want to get exactly for this node. It is
ignored for internal nodes. Works only for leaf nodes.
It can be one of the following:
S: gets the subtree
rooted at this node.
T: gets the text of the node.
I: gets the node itself only.
V: gets value of the node. If it is an element
node, the tag name is returned. If it is an attribute
node, the attribute value of attrName
is returned. If it is a text node, the text is returned.
If it is a document node, the document name is returned.
A: gets an attribute of the node. In this case,
you need to add a field to
the string just after
,A, which is the attr name
that you want returned.
L: gets the local subtree
of the node. The difference between this
option and the ‘S’ option is that the ‘S’
will get you the subtree of
the node from the database. While the ‘L’ option
gets it from the witness tree.
attrName
:
is the name of the attribute to be retrieved if getWhat
= A or getWhat=V. In the case of the V, if the node
is not an attribute node, pass NULL. otherwise, don’t
pass anything here.
NOTE:
if ns = I, then
the following num-1 lines to this line should be index
accesses or scan lines. Each line corresponds to a
pattern tree node in a depth-first pre-order traversal.
rootNRE,n2,ns,lev,tagName,rel,getWhat,attrName….
If type is element or content.
RootNRE:
is the nre of the root of
the pattern tree node in the input witness trees.
n2:
is the number of nodes in the pattern tree. Level,
relation, getWhat should be
repeated num-1 number of times. Level,relation,getWhat
will describe nodes of the pattern tree.
The nodes will be listed in a depth-first
pre-order manner.
Ns:
node source: if I, then index accesses are used to
retrieve nodes. If W, then the witness tree is scanned
for nodes. . If ns = I then you should pass in the
following lines a bunch of index access. If ns=W, then
after lev an extra field is
required, tagName.
Level:
the level of the node in the pattern tree.
tagName
:
pass this field only if ns = W. otherwise, don’t pass
anything here.
Relation:
the relationship with its parent in the pattern either
A: ancs-desc or P:
parent-child.
GetWhat:
is
what you want to get exactly for this node. It is
ignored for internal nodes. Works only for leaf nodes.
It can be one of the following:
S: gets the subtree
rooted at this node.
T: gets the text of the node.
I: gets the node itself only.
V: gets value of the node. If
it is an element node, the tag name is returned. If it
is an attribute node, the attribute value of attrName
is returned. If it is a text node, the text is returned.
If it is a document node, the document name is returned.
A: gets an attribute of the node. In this case,
you need to add a field to
the string just after
,A, which is the attr name
that you want returned.
L: gets the local subtree
of the node. The difference between this
option and the ‘S’ option is that the ‘S’
will get you the subtree of
the node from the database. While the ‘L’ option
gets it from the witness tree.
attrName
: is the name of the attribute to be retrieved if
getWhat = A or getWhat=V. In
the case of the V, if the node is not an
attribute node, pass NULL. otherwise, don’t pass
anything here.
NOTE:
if ns = I, then
the following num-1 lines to this line should be index
accesses or scan lines. Each line corresponds to a
pattern tree node in a depth-first pre-order traversal.
rootNRE,0,rootGetWhat,attrName.
This is a special case of the previous case. This is
when you want to output the root or information
regarding the root.
RootNRE:
is the nre of the node in
the input witness trees that you want to retrieve its
data.
RootGetWhat:
is
what you want to get exactly for this node. It can be
one of the following:
S: gets the subtree
rooted at this node.
T: gets the text of the node.
I: gets the node itself only.
V: gets value of the node. If it is an element
node, the tag name is returned. If it is an attribute
node, the attribute value of attrName
is returned. If it is a text node, the text is returned.
If it is a document node, the document name is returned.
A: gets an attribute of the node. In this case,
you need to add a field to
the string just after
,A, which is the attr name
that you want returned.
L: gets
the local subtree of the
node. The difference between this
option and the ‘S’ option is that the ‘S’
will get you the subtree of
the node from the database. While the ‘L’ option
gets it from the witness tree.
attrName
: is the name of the attribute to be retrieved if
getWhat = A or getWhat=V. In
the case of the V, if the node is not an
attribute node, pass NULL. otherwise, don’t pass
anything here.
|
Stack-based
Term Join
TmpFile.h
|
t,NRE,indexName,filename,phrase,function1,function2,function3,depth,simpleScore,
keywordset,meetAlg,parentIndexName
|
NRE:
the nre to be assigned to
the output nodes.
IndexName:
is the name of the index to look up when searching for
the words in the sentence. It should be some kind of an
inverted index or so.
Filename:
name
of the file the index is built on.
Phrase:
this is the phrase you want to find its word in the
document. And common ancestors of these words will be
returned.
Function1:
score function. The number you pass here is the number
of the function in a switch statement. Basically, you
write your function, add a case with a number to the
first switch statement in the EVALUATION_OP_SBTERMJOIN
case. The function should take in a char * and return a
double.
Function2:
ancestor depth threshold function. The number is the
same as above but to the second switch statement. The
function you use should take in Keytype
and int and return an
integer, which is the number of level to go up in the
document. If –1 is returned the function does nothing.
Function3:
ancestor qualifies function. The number is like the
above except that it is the third switch statement in
the EVALUATION_OP_SBTERMJOIN
case. The function should take a double in and returns a
true or false.
Depth:
is the expected
depth of the document.
SimpleScore:
pass a 0 or 1 value here. 0 if you want the complex
score to be calculated for each ancestor or 1 if you
want a simple score that counts the number of
occurrences of each term per ancs.
Keywordset:Cong
stuff. Just pass 0.
MeetAlg:
pass 0 or 1. 0 if you want the stack based alg
to be performed. 1 if you want the meet alg
to be performed.
ParentIndexName:
optional. If passed the parent index is used in getting
ancestors. Otherwise, accessing the database.
|
Navigational
Get Relative
NavigationalGetRelative.h
|
V,nre,relation,num,nre2
|
NRE:
nre of the node you want to
get its relative.
Relation:
one of the following:
1A: gets a specific ancestor.
1C: gets a specific child.
AAO: gets all ancestors and outputs them one at a
time.
ADO
: gets all descendants and outputs them one at a time.
ACO: gets all children and outputs them one at a
time.
AAOI: gets all ancestors and outputs them one at
a time.
Including
the node itself.
ADOI: gets all descendants and outputs them one
at a time. Including the node itself.
ACOI: gets all children and outputs them one at a
time. Including the node itself
AAT: gets all ancestors and outputs them all at
once in the same output tree.
ADT: gets all descendants and outputs them all at
once in the same output tree.
ACT: gets all children and outputs them all at
once in the same output tree.
AT: gets all text nodes under node index.
CT: gets
children text nodes under node index.
A: gets attribute node of node index.
Num:
is used only when relation is one of the following:
1A: num is used to specify which ancestor is
returned in terms of levels higher. So if num=1, the
immediate parent is returned. Num=2, the grandparent is
returned and so on.
1C: num is used to specify which child to be
returned. Num=0, first child is returned, num =1 second
child,… num=-1, last child
is returned.
NRE2:
nre to be assigned to the
retrieved relative(s).
|
FileWriter
FileWriterIterator.h
|
w,filename
|
Filename: the name of the file that
you want your query results to be written to. This cannot be a leaf node.
Place it anywhere in the query plan except at leaves.
|
FileReader
FileReaderIterator.h
|
d,filename
|
Filename: the name of the file that
you want to read from. This must be a leaf node. You need to write query
results using FileWriter before
using fileReader to read them.
|
Pick
Iterator
PickIterator.h
|
K,function1,function2,depth,sort
|
Function1: worth outputting
function. This function determines whether
or not a node is worth outputting. The number you pass here is the number of
the function in a switch statement. Basically, you write your function, add a
case with a number to the first switch statement in the EVALUATION_OP_PICK case. The function should take in a witnessTree *
and return a bool.
Function2: percentage function. Returns the percentage of children of a node that if these
children are worth outputting, the parent is output instead. The number is
the same as above but to the second switch statement. The function you use
should take in nothing and return a float.
Depth: is the expected depth of the document.
Sort: if 1, the input will be sorted in the Pick
iterator. If 0, the input is assumed to be sorted by start key.
|
Data
Instantiation
DataInstatiationIterator.h
|
i,index,scope,cutDepth,getElement,getAttr,getContent
|
Index: the index of the node in
the witness tree that you want to instantiate its subtree.
Scope: I: get node itself
S: get whole subtree
C: get whole subtree until a given depth
CutDepth: applicable if scope = C.
pass the depth to which you want the subtree to be instatiated.
GetElement: if 0, no element nodes in
the subtree are returned. If 1, element nodes are returned.
GetAttr:
if 0, no
attr nodes in the subtree are returned. If 1, attr nodes are returned.
GetContent:
if 0, no
content nodes in the subtree are returned. If 1, content nodes are returned.
|
SortStopK
SortStopKIterator.h
|
k,K,inSize
|
K:
number of
outputs. This iterator returns top k scoring inputs.
inSize
: number of inputs to be
read by this iterator. Among these inputs, top k scoring ones will be output
in order of descending score.
|
Projection
ProjectionIterator.h
|
P,num,NRE,..,preserveNodeOrder
|
Num:
number of NRE you want to keep in the tree.
NRE:
NRE of nodes to be kept. NRE should be repeated num number of times.
PreserveNodeOrder:
if 1: nodes are kept in the output tree in their original order.
0: nodes are kept in the
output tree in the order of the input NRE.
|
Universal
Quantification
|
u,rootNRE,num,relation
,…,compareWhat,attrName,operation,number,str
|
rootNRE
:
the
nre of the node you want to
perform universal quantification on. For example, if you
want to get books that have all authors affiliated with
university
of
Michigan
. Then, nre of book in the
input trees should be passed here.
num:
is
the number of nodes in the pattern. So, to follow up on
the previous example. the
pattern is author
with child/desc affiliation.
So that gives us 2 for num. the following num lines in
the evaluation plan should be index accesses to pattern
nodes. In our example, the following 2 lines to the
universal quantification line should be index access to
author and then index access to affiliation.
relation:
relation
is repeated num number of times. For each node in the
pattern, the corresponding relation is either P or A
(parent-child or ancs-desc).
This tells us the relation of the node with what comes
before it in the pattern. For the root of the pattern,
relation is its relationship with the rootNRE
node. To follow up on our example, you should pass 2
relations, one is for book-author and the other is for
author-affiliation.
compareWhat:
it
is one of two values:
T: if you want text of the leaf of the pattern to
be checked.
A: if you want an attribute of the leaf of the
pattern to be checked.
In
our example, we want to check the affiliation text and
make sure it is equal to
university
of
Michigan
.
attrName
:
if
you use the A option in the previous parameter, then you
should pass an attribute name. Otherwise, pass NULL.
operation:EQN,NEN,LTN,LEN,GTN,GEN:=,!=,<,<=,>,>=
using number comparison.
EQS,NES,LTS,LES,GTS,GES:=,!=,<,<=,>,>=
using string comparison.
C: contains. String comparison.
S: starts with. String comparison.
In
our example, we should use EQS because we want the affilation
text be equal to “
university
of
Michigan
”.
number:
if
you use any of the number operations, then pass a number
to compare against. Otherwise, pass -1.
str
:
if
you use any of the string operations, then pass a
string. Otherwise, pass NULL. In our example, pass
“University of Michigan” here.
|
MLCA (Meaningful Lowest Common Ancestor)
meaningfulClosestCommon
AncestorStructure.h
|
m,num,NRE1,NRE2,...,NREnum,assignedNREforRoot,expectedSize,expectedDepth
|
Num: number of leaf nodes you want to keep in the MLCAS
(Meaningful Lowest Common Ancestor Structure)
NREi: NRE value of the ith leaf node
assignedNREforRoot: assigned NRE for the root node of each MLCAS
expectedSize
: the max of the estimated size of all the
inputs. Enter -1 if unknown.
expectedDepth
: the depth of the lowest nodes from the
inputs. Enter -1 if unknown
|
Update
UpdatesIterator.h
|
U,ModTNode,ElementNRE,CONST,constantValue
U,ModTNode,ElementNRE,REF,ElementRefNRE
U,ModTNode,ElementNRE,REF,DummyNodeNRE
U,ModTNode,ElementNRE,REF,AttributeNRE,AttributeName
|
Modifying
a text node:
ModTNode
will modify the text value of the element with NRE=ElementNRE.
Only one element node per witness tree should have an NRE=ElementNRE,
otherwise an error will be issued, and the transaction will
be aborted. The new text value will become the last child of
the element (and all previous text nodes for this element
will be removed). When using the REF option, the NREs
for ElementRefNRE, DummyNodeNRE, and AttributeNRE
should also have only one node with this NRE in the witness
tree. If this is not the case, the query will be carried out
(by assigning the first of such values as the new text value
for the element node with NRE=ElementNRE) but a warning will
be issued.
Note: all subsequent update operators have these same
restrictions. The first NRE, which specifies the variable to
modify must be a singleton (only one in the witness tree),
all subsequent NREs may be mapped to more than one node in a
given witness tree, but only the first of such values will
be used for the update.
...,CONST,constantValue: constantValue will be
the new text value.
...,REF,ElementRefNRE: the text value for the
element with NRE = ElementRefNRE will be the new text value.
...,REF,DummyNodeNRE: the value contained in the
"Dummy Node" (e.g.
generated by the Function Iterator) with NRE=DummyNodeNRE is
the new text
value.
...,REF,AttributeNRE,AttributeName: the value
associated with the
attribute AttributeName is the new text value (an attribute
node stores
all of the attribute name/value pairs associated with a
given element, so
AttributeNRE by itself would specify a group of attributes).
|
U,ModANode,AttributeNRE,AttributeName,CONST,constantValue
U,ModANode,AttributeNRE,AttributeName,REF,ElementRefNRE
U,ModANode,AttributeNRE,AttributeName,REF,DummyNodeNRE
U,ModANode,AttributeNRE,AttributeName,REF,AttributeRefNRE,AttributeName
|
Modifying
an attribute value:
ModANode will modify the value associated with
AttributeName (for the
attribute node with NRE=AttributeNRE).
As with ModTNode, both CONST and REF values
can be assigned to the
attribute value, and the descriptions of these options are
the same as for
ModTNode. If the attribute AttributeName does
not exist, an attribute
value pair is not inserted.
|
U,DelTNode,ElementNodeNRE
|
Deleting
text node(s):
DelTNode
deletes all of the text content from the element node
specified
by ElementNodeNRE.
|
U,DelAttrValue,AttributeNRE,AttributeName
|
Deleting
an attribute value:
DelAttrValue
deletes the attribute value associated with AttributeName,
from the attribute node with NRE=AttributeNRE.
|
U,DelTree,ElementNRE
|
Deleting
a subtree (rooted at an element node):
DelTree
deletes the subtree rooted at the element with NRE=ElementNRE.
All descendent elements are deleted, as well as their
associated text and
attributes.
|
U,InsAttr,ElementNRE,AttributeName,CONST,constantValue
U,InsAttr,ElementNRE,AttributeName,REF,ElementRefNRE
U,InsAttr,ElementNRE,AttributeName,REF,DummyNodeNRE
U,InsAttr,ElementNRE,AttributeName,REF,AttributeRefNRE,AttributeName
|
Inserting
an attribute:
InsAttr
inserts an attribute (and possibly an associated value) as
the
last attribute of the element with NRE=ElementNRE.
The value of the attribute is given as either a constant
value, or a
reference (in the same way as ModTNode).
Note: 'U,InsAttr,ElementNRE,AttributeName,<NOVAL>'
is used to insert an
attribute without a corresponding value.
|
A.
Inserting an element node with no associated
text/content:
U,InsENode,ElementNRE,newChildElementTag,<NOVAL>
B. Inserting an element node with a constant value='val'
for the child text node:
U,InsENode,ElementNRE,newChildElementTag,CONST,val
C. Inserting an element node with a reference value for
the child text node
(note: X is not used, but must be included):
U,InsENode,ElementNRE,newChildElementTag,REF,ElementNodeNRE,X
U,InsENode,ElementNRE,newChildElementTag,REF,DummyNodeNRE,X
U,InsENode,ElementNRE,newChildElementTag,REF,AttributeNRE,AttributeName
|
nsENode
inserts a new element as a child of the element having
NRE=ElementNRE, with (optional) text and attribute values.
We
can include arguments for attribute insertion after any of
the
five lines in 'A', 'B', and 'C',
These attribute arguments are given as 0 or more repetitions
of the
following five patterns (where X is not used, but must be
included):
,AttributeName,<NOVAL>
,AttributeName,CONST,constantValue
,AttributeName,REF,ElementRefNRE,X
,AttributeName,REF,DummyNodeNRE,X
,AttributeName,REF,AttributeRefNRE,AttributeName
|