Frequently Asked Questions (FAQ)

Contents

This page covers some typical questions that might arise while using Timber. If you have problems that are not addressed here (or not addressed adequately), please let us know.

*   1. The system says out of disk space

*   2. The system says out of logging space

*   3. The system asks me to increase container size

*   4. My query doesn't seem to work, but I'm sure it is a valid XQuery

*   5. Soap server time out

*   6. What is database reorganization, and how should I do it in Timber?

*   7. What are the types of index supported by Timber, and how to choose which type(s) to use?

*   8. While performing an 'insert' update query, I received the error message 'could not generate a new key... the range is too small', what should I do?

 

 

1. The system says out of disk space

You will need to increase the volume size by changing perf.server.*.device_quota in the <basedir>/Timber/bin/resource/example.cfg file. The space a data file takes once it is loaded into a volume may be significantly larger than the original size of the XML document, depending on the number of nodes it contains and the indices built. Extra space on the volume is also needed for queries. In general, we recommend setting the volume size to 10 to 20 times the size of the data file to be loaded into the volume. The minimum recommended volume size is 50MB. Please note that for any change in the configuration file to take effect, you will have to reload your dataset.
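The sizing rule above can be turned into a quick calculation. The following Python sketch is only an illustration: the function name and constants are hypothetical, and it works in megabytes. The unit expected by device_quota is not stated here, so check the configuration file before copying the numbers into it.

```python
import math

# Values taken from the recommendation above; the names are illustrative only.
MIN_VOLUME_MB = 50      # minimum recommended volume size
LOW_FACTOR = 10         # lower end of the recommended multiple
HIGH_FACTOR = 20        # upper end of the recommended multiple


def recommended_volume_mb(data_file_mb):
    """Return (low, high) recommended volume sizes in MB for a data file."""
    if data_file_mb < 0:
        raise ValueError("data file size must be non-negative")
    # Never go below the minimum recommended volume size.
    low = max(MIN_VOLUME_MB, math.ceil(data_file_mb * LOW_FACTOR))
    high = max(MIN_VOLUME_MB, math.ceil(data_file_mb * HIGH_FACTOR))
    return low, high


if __name__ == "__main__":
    low, high = recommended_volume_mb(8)
    print("Recommended volume size: %d to %d MB" % (low, high))
```

For an 8MB data file this suggests a volume of 80 to 160MB; for very small files the 50MB minimum applies.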

2. The system says out of logging space

You will need to increase the volume size by changing perf.server.*.device_quota in the <basedir>/Timber/bin/resource/example.cfg file, or turn logging off by setting ?.server.*.sm_logging to no. Please note that for any change in the configuration file to take effect, you will have to reload your dataset.

 

3. The system asks me to increase container size

You will need to change CONTAINER_SIZE in the file <basedir>/Timber/bin/Resource/timber.settings to the recommended size given in the error message. This is the container size used in structural joins: when a specific ancestor has many descendants, they are kept in a Shore file in which each record is almost CONTAINER_SIZE in size. You do not have to reload your data for changes in the Timber settings file to take effect.

4. My query doesn't seem to work, but I'm sure it is a valid XQuery

First, please note that XQuery is case-sensitive, and all XQuery keywords (such as for, let) and function names should be in lower case. If any keyword or function name in your query is written in upper case, the XQuery parser of Timber may not be able to interpret it correctly.

Second, if your query contains a tag name that happens to be identical to an XQuery keyword, the XQuery parser of Timber may not be able to interpret it correctly.

Third, your query may not yet be supported by Timber. Timber supports a wide range of XQuery features, but certain types of XQuery are not currently supported. For example, the current version of Timber does not support the keyword distinct. In addition, we do not support nested queries with multiple nestings, or nested queries that are not connected with the outer part via a join (either a value join or a structural join indicated by a path). If you are certain that your query is valid but it does not seem to work in Timber (it breaks the system, runs forever, or outputs incorrect results), your query is not yet supported by Timber. Please read here for XQuery Fragment Supported. If you think your query is supported by Timber but you still have a problem with it, please report it to us.
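Before reporting a problem, it can help to rule out the first cause mechanically. The Python sketch below is not part of Timber; the function name and the keyword list are assumptions made for illustration, and the list is deliberately small and not exhaustive. It flags words that match a keyword only when case is ignored.

```python
import re

# A small, non-exhaustive keyword set chosen for illustration only.
KEYWORDS = {"for", "let", "where", "return", "in",
            "order", "by", "if", "then", "else"}


def miscased_keywords(query):
    """Return words that equal a keyword only when case is ignored."""
    # Blank out string literals so quoted text is never flagged.
    stripped = re.sub(r'"[^"]*"|\'[^\']*\'', " ", query)
    hits = []
    for word in re.findall(r"[A-Za-z]+", stripped):
        if word.lower() in KEYWORDS and word != word.lower():
            hits.append(word)
    return hits


if __name__ == "__main__":
    query = 'FOR $a in document("sbook.xml")//author Return $a'
    print(miscased_keywords(query))
```

The check is purely textual, so it will also flag a tag name such as Return. That is the situation described in the second point above, and such hits need to be reviewed by hand.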

5. Soap server time out

If you get the SOAP server time out error message while using the GUI, you can increase the client-side time limit (330000 milliseconds by default) by choosing File -> Change Client timeout, or, from the command line, by adding the optional parameter -t timeoutinmsecs (in milliseconds) to the following command:

 <basedir>/XQueryparser/parserrun/bin/release/parserrun  -x -i xquery -o xml [-t timeoutinmsecs]

6. What is database reorganization, and how should I do it in Timber?

Timber provides update functionality by extending the XQuery language. When no updates have been performed on a Timber volume, all nodes for a document are laid out on the disk in document order. This is a key factor in efficient query execution. When updates are performed, we may be required to put nodes into a separate, unordered overflow portion of the file. This happens when we insert new nodes, and when a node is modified so that its new value is larger than its previously stored value. If the overflow portion is large enough, query performance will suffer. To alleviate this problem, Timber provides a reorganize function that takes a file containing an overflow portion and lays all of the nodes out on disk in document order.

To reorganize a file from the command line, run the following command, where sbook.xml is the document to reorganize:

        timber -m reorganize -d sbook.xml

 

7. What are the types of index supported by Timber, and how to choose which type(s) to use?

The types of index supported by Timber are listed below, along with a brief explanation and example usage for each. An example index file can be found here.

Each entry below gives the index type (with its type code), its options, the build command, and details.

Element index (e):

*   't': element tag
    Build command: sbook.xml s et  // Build element tag index
    Details: Improves the efficiency of queries that retrieve elements by their tag names, such as the following query:

        for $a in document("sbook.xml")//author
        return $a

*   'c': element content
    Build command: sbook.xml s ec title  // Build element content index on the content of element title in sbook.xml
    Details: Improves the efficiency of queries that retrieve elements by their element value, such as the query below. NOTE: One element content index is needed for each element to be retrieved by content value.

        for $b in document("sbook.xml")//book
        where $b/title = "Timber"
        return $b

Attribute index (a):

*   'n': attribute name
    Build command: sbook.xml s an  // Build attribute name index
    Details: Improves the efficiency of queries that retrieve elements by their attribute names, such as the following query:

        for $a in document("sbook.xml")//book
        where $a/@* = "Timber"
        return $a

*   'v': attribute value
    Build command: sbook.xml s av  // Build attribute value index
    Details: Improves the efficiency of queries that retrieve elements by their attribute values, such as the following query:

        for $a in document("sbook.xml")//@editor
        where $a = "John"
        return $a

*   'c': attribute content (needs optional value)
    Build command: sbook.xml s ac editor  // Build attribute content index
    Details: Improves the efficiency of queries that retrieve elements by their attribute value, such as the query below. NOTE: One attribute content index is needed for each element to be retrieved by its attribute content value.

        for $b in document("sbook.xml")//book
        where $b/@editor = "Yunyaor"
        return $b

Text index (t):

*   'v': text value
    Build command: sbook.xml s tv  // Build text value index
    Details: Improves the efficiency of retrieving text by its value. It is similar to the element content index without the element name restriction, but the return type is a text node.

Inverted index (i):

*   'e': index the leaf level element node, or 't': index the leaf level text node
*   's': stem, or 'n': nostem
    Build commands:
        sbook.xml s ie   // Build inverted index on leaf level element node
        sbook.xml s itn  // Build inverted index of leaf level content, no stemming
    Details: This is used by more advanced evaluation involving IR and is currently not supported at the XQuery user level.

Parent index (p):

*   No additional options
    Build command: sbook.xml s p  // Build parent index for all elements
    Details: This is used by more advanced evaluation such as the meet iterator and is currently not supported at the XQuery user level.

Updatable element tag index (1):

*   No additional options
    Build command: sbook.xml s 1  // Build updatable element tag index
    Details: This index helps speed up the same queries as an element tag index. While the element tag index will be updated by any of the XQuery update functions, the updatable element tag index (tagid) is better optimized for these updates and will be much faster. As a tradeoff, regular (non-update) queries will be slightly slower with this index than with the element tag index. Note: you would never want to have both an element tag index and an index of this type built on the same data set, since both indices would have to be updated when update queries are issued.

Updatable attribute name index (2):

*   No additional options
    Build command: sbook.xml s 2  // Build updatable attribute name index
    Details: This index helps speed up the same queries as an attribute name index. The performance tradeoffs are the same as for the updatable element tag index above. Similarly, you would never want to build both this index and the attribute name index.

Join index (j):

*   Options: 'ec': elementcontent, 'av': attributevalue, 'ac': attributecontent
    Build command: sbook.xml d j ec page av number  // Build join index //page = //@number
    The command takes 4 options: [left side type] [left side tag/attribute name] [right side type] [right side tag/attribute name]. Only [Double] is forced, no matter which of [Int|String|Float|Double] is given.
    Details: Improves equality join queries according to the left side and right side built. ac is similar to av but returns the element node, so it is currently not used by the optimizer. Note that the mirror index is always built; for example, a leftside = rightside index is coupled with a rightside = leftside index.

        for $a in document("sbook.xml")//book
        where $a/page = $a/@number
        return $a

 
8. While performing an 'insert' update query, I received the error message 'could not generate a new key... the range is too small', what should I do?

The node keys in Timber are doubles. This allows Timber to insert a new node by assigning it a key that lies between two existing node keys. Unfortunately, after many updates, the available space between keys may be exhausted, because the precision of doubles is limited (and is worse for larger numbers). In this case, the update query is aborted and the device is left unchanged. Currently, Timber does not support renumbering keys on the fly, so the device (and its indices) must be recreated if this error message occurs during a required insert query. Unfortunately, the easiest way to do this is to dump the (Timber) device to an XML file and reload it (and any indices). For the sample data file, we would simply execute:

 

timber -m physical -f dump.xml -q ..\dump-query.txt

 

where dump-query.txt contains:

V,1,ADT,num,2
S,1,sbook.xml,0,DOCUMENT_NODE,THISNODE,1,1,XMLFILENAME,EQ,STR,sbook.xml,0,0,0

 

The resulting XML file dump.xml needs to have the file name removed from the first line, and then it can be reloaded into Timber:

timber -m load -r 1 -d dump.xml
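
To see why the key range eventually runs out, the following Python sketch counts how many keys fit between two doubles when each new key is placed at the midpoint of the gap and every insert lands right after the same node. The midpoint rule and the function name are assumptions made for illustration; how Timber actually chooses a new key is not described here.

```python
def inserts_until_exhausted(low, high):
    """Count how many new keys fit between low and high in the worst case.

    Each new key is the midpoint of the current gap, and the next insert
    always goes between low and the most recently created key.
    """
    count = 0
    while True:
        mid = low + (high - low) / 2.0
        # Once the midpoint rounds onto an endpoint, no distinct key is left.
        if mid <= low or mid >= high:
            return count
        high = mid
        count += 1


if __name__ == "__main__":
    print(inserts_until_exhausted(1.0, 2.0))        # gap between small keys
    print(inserts_until_exhausted(1024.0, 1025.0))  # same gap, larger keys
```

With these inputs the gap between 1.0 and 2.0 accepts 52 inserts, while the equally wide gap between 1024.0 and 1025.0 accepts only 42, which illustrates why precision is worse for larger key values.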


Last Updated: 07/23/2004