This page covers some typical questions that might arise while using Timber. If you have problems that are not addressed here (or not addressed adequately), please let us know.
1. The system says out of disk space
2. The system says out of logging space
3. The system asks me to increase the page size
4. My query doesn't seem to work, but I'm sure it is a valid XQuery
6. What is database reorganization, and how should I do it in Timber?
7. What are the types of index supported by Timber, and how to choose which type(s) to use?
You will need to increase the volume size by changing perf.server.*.device_quota in the <basedir>/Timber/bin/resource/example.cfg file. The space a data file takes once loaded into a volume may be significantly larger than the original size of the XML document, depending on the number of nodes it contains and the indices built. Extra space on the volume is also needed for queries. In general, we recommend setting the volume size to 10 to 20 times the size of the data file to be loaded into the volume. The minimum recommended volume size is 50MB. Please note that for any change in the configuration file to take effect, you will have to reload your dataset.
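The sizing rule above can be sketched as a small calculation. This is only an illustration: the function name `recommended_volume_mb` is hypothetical, and the result is expressed in megabytes, which may not be the unit that device_quota expects in your configuration file.

```python
import math
from typing import Tuple

MIN_VOLUME_MB = 50  # minimum recommended volume size stated above


def recommended_volume_mb(data_file_bytes: int) -> Tuple[int, int]:
    """Return (low, high) volume sizes in MB: 10x to 20x the data file,
    never below the 50 MB minimum."""
    data_mb = data_file_bytes / (1024 * 1024)
    low = max(MIN_VOLUME_MB, math.ceil(10 * data_mb))
    high = max(MIN_VOLUME_MB, math.ceil(20 * data_mb))
    return low, high


if __name__ == "__main__":
    # An 8 MB XML document (hypothetical size) suggests an 80 to 160 MB volume.
    print(recommended_volume_mb(8 * 1024 * 1024))
```

For a real document, pass `os.path.getsize(...)` of the XML file as the argument, then convert the result to whatever unit your device_quota setting uses.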
You will need to increase the volume size by changing perf.server.*.device_quota in the <basedir>/Timber/bin/resource/example.cfg file, or turn the log off by setting ?.server.*.sm_logging to no. Please note that for any change in the configuration file to take effect, you will have to reload your dataset.
You will need to change CONTAINER_SIZE
in the file <basedir>/Timber/bin/Resource/timber.settings
to the recommended size given in the error message you get. This is the container size used
in structural joins: when a specific ancestor has many descendants,
they are kept in a Shore file in which each record is almost CONTAINER_SIZE in size.
You do not have to reload your data for changes in the Timber
settings file to take effect.
First, please note that XQuery is case-sensitive, and all XQuery keywords (such as for, let) and function names should be in lower case. If any keyword or function name in your query is written in upper case, the XQuery parser of Timber may not be able to interpret it correctly.
Second, if your query contains a tag name that happens to be identical to an XQuery keyword, the XQuery parser of Timber may not be able to interpret it correctly.
Third, your query may not yet be supported by Timber. Timber supports a wide range of XQuery features, but certain types of queries are not currently supported. For example, the current version of Timber does not support the keyword distinct. In addition, we do not support nested queries with multiple levels of nesting, or nested queries that are not connected with the outer part via a join (either a value join or a structural join indicated by a path). If you are certain that your query is valid but it does not seem to work in Timber (it breaks the system, runs forever, or outputs incorrect results), your query is probably not yet supported by Timber. Please read here for XQuery Fragment Supported. If you think your query is supported by Timber but you still have problems with it, please report it to us.
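As a quick check for the first point, the sketch below flags words in a query that look like XQuery keywords but are not written in lower case. It is not part of Timber: the function name `find_case_problems` is hypothetical, and the keyword set is a small, non-exhaustive assumption rather than the list used by Timber's parser.

```python
import re
from typing import List

# Assumption: a small sample of XQuery keywords, not Timber's actual list.
KEYWORDS = {"for", "let", "where", "return", "in", "order", "by"}


def find_case_problems(query: str) -> List[str]:
    """Return words that match a keyword when lower-cased but are not
    themselves written entirely in lower case."""
    # Blank out double-quoted string literals so their text is not inspected.
    stripped = re.sub(r'"[^"]*"', '""', query)
    words = re.findall(r"[A-Za-z]+", stripped)
    return [w for w in words if w.lower() in KEYWORDS and w != w.lower()]


if __name__ == "__main__":
    q = 'FOR $a in document("sbook.xml")//book Return $a'
    print(find_case_problems(q))  # ['FOR', 'Return']
```

An empty result does not mean the query is supported; it only rules out the most common case-related mistake.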
If you get the SOAP server timeout error message while using the GUI, you can increase the client-side time limit (330000 milliseconds by default) by choosing File -> Change Client timeout, or from the command line by adding the optional -t parameter (the timeout in milliseconds) to the following command:
<basedir>/XQueryparser/parserrun/bin/release/parserrun -x -i xquery -o xml [-t timeoutinmsecs]
Timber provides
update functionality by extending the XQuery language. When no
updates have been performed on a Timber volume, all nodes for a document
are laid out on the disk in document order. This is a key factor
in efficient query execution. When updates are performed we may be
required to put nodes into a separate unordered overflow portion of the
file. This happens when we insert new nodes, and when a node is
modified so that its new value is larger than its previously stored
value. If the overflow portion is large enough, query performance
will suffer. To alleviate this problem, Timber provides a
reorganize function that will take a file containing an overflow
portion, and lay all of the nodes out on disk in document order.
To reorganize a file from the command line, run the following command,
where sbook.xml is the document to reorganize:
timber -m reorganize -d sbook.xml
The types of index supported by Timber are listed below, along with a brief explanation and example usage for each. An example index file can be found here.
Types of Index | Options | Build Command | Details |
Element index (e) | 't': element tag | sbook.xml s et // Build element tag index | Improves the efficiency of queries that retrieve elements by their tag names. |
Element index (e) | 'c': element content | sbook.xml s ec title // Build element content index on the content of element title in sbook.xml | Improves the efficiency of queries that retrieve elements by their element value. NOTE: One element content index is needed for each element to be retrieved by content value. |
Attribute index (a) | 'n': attribute name | sbook.xml s an // Build attribute name index | Improves the efficiency of queries that retrieve elements by their attribute names, such as the following query: for $a in document("sbook.xml")//@editor |
Attribute index (a) | 'v': attribute value | sbook.xml s av // Build attribute value index | Improves the efficiency of queries that retrieve elements by their attribute values, such as the following query: for $a in document("sbook.xml")//book where $a/@* = "Timber" return $a |
Attribute index (a) | 'c': attribute content (needs optional value) | sbook.xml s ac editor // Build attribute content index | Improves the efficiency of queries that retrieve elements by the value of a specific attribute, such as the following query: for $b in document("sbook.xml")//book where $b/@editor = "Yunyaor" return $b. NOTE: One attribute content index is needed for each element to be retrieved by its attribute content value. |
Text index (t) | 'v': text value | sbook.xml s tv // Build text value index | Improves the efficiency of queries that retrieve text by its value. Similar to the element content index, but without the element name restriction, and the return type is a text node. |
Inverted index (i) | 'e': index the leaf-level element nodes | sbook.xml s ie // Build inverted index on leaf-level element nodes (TODO) | Used by more advanced evaluation involving IR; currently not supported at the XQuery user level. |
Inverted index (i) | 't': index the leaf-level text nodes | | |
Inverted index (i) | 's': stem | sbook.xml s itn // Build inverted index on leaf-level content, no stemming | |
Inverted index (i) | 'n': no stem | | |
Parent index (p) | no additional options | sbook.xml s p // Build parent index for all elements | Used by more advanced evaluation such as the meet iterator; currently not supported at the XQuery user level. |
Updatable element tag index (1) | no additional options | sbook.xml s 1 // Build updatable element tag index | Speeds up the same queries as an element tag index. While the element tag index is also updated by any of the XQuery update functions, the updatable element tag index (tagid) is better optimized for these updates and will be much faster. As a tradeoff, regular (non-update) queries will be slightly slower with this index than with the element tag index. Note: you would never want to have both an element tag index and an index of this type built on the same data set, since both indices would have to be updated whenever update queries were issued. |
Updatable attribute name index (2) | no additional options | sbook.xml s 2 // Build updatable attribute name index | Speeds up the same queries as an attribute name index. The performance tradeoffs are the same as for the updatable element tag index above. Similarly, you would never want to build both this index and the attribute name index. |
Join index (j) | 'ec': element content | sbook.xml d j ec page av number // Build join index //page = //@number | Takes 4 options: [left side type] [left side tag/attribute name] [right side type] [right side tag/attribute name]. The value type is always forced to [Double], no matter which of [Int|String|Float|Double] is given. Improves equality join queries according to the left and right sides built. ac is similar to av but returns element nodes, and so is currently not used by the Optimizer. Note that the mirror index is always built; for example, a leftside = rightside index is coupled with a rightside = leftside index. for $a in document("sbook.xml")//book |
Join index (j) | 'av': attribute value | | |
Join index (j) | 'ac': attribute content | | |
The node keys in Timber are doubles. This allows Timber to insert a new node by assigning it a key that lies between two existing node keys. Unfortunately, after many updates, the available space (limited by the precision of doubles, which is worse for larger numbers) may be exhausted. In this case, the update query is aborted and the device is left unchanged. Currently, Timber does not support renumbering keys on the fly, so the device (and its indices) must be recreated if this error message occurs during a required insert query. The easiest way to do this is to dump the (Timber) device to an XML file and reload it (and any indices). For the sample data file, we would simply execute:
timber -m physical -f dump.xml -q ..\dump-query.txt
where dump-query.txt contains:
V,1,ADT,num,2
S,1,sbook.xml,0,DOCUMENT_NODE,THISNODE,1,1,XMLFILENAME,EQ,STR,sbook.xml,0,0,0
The resulting xml file dump.xml needs to have the file name removed from the first line, and then it can be reloaded into Timber:
timber -m load -r 1 -d dump.xml
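The key exhaustion described above can be illustrated with ordinary double arithmetic. The sketch below assumes that each new key is placed at the midpoint between its two neighbours; this is an assumption made for illustration, not a description of how Timber actually assigns keys, and the function name `inserts_until_exhausted` is hypothetical.

```python
def inserts_until_exhausted(low: float, high: float) -> int:
    """Count how many keys can be inserted, one after another, directly
    after `low`, when each new key is the midpoint of `low` and the
    previously inserted key (starting from `high`)."""
    count = 0
    while True:
        mid = (low + high) / 2.0
        # Once the midpoint is no longer strictly between the two keys,
        # no double is left in the gap that this scheme can reach.
        if mid <= low or mid >= high:
            return count
        high = mid
        count += 1


if __name__ == "__main__":
    # Between 1.0 and 2.0 a double leaves room for 52 such insertions.
    print(inserts_until_exhausted(1.0, 2.0))  # 52
    # The same gap of 1.0 between larger keys is exhausted sooner.
    print(inserts_until_exhausted(1e6, 1e6 + 1.0))
```

Repeated insertion at the same position is the worst case; inserts spread across the document use up the available keys far more slowly.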
Last Updated: 07/23/2004