Shore/Berkeley DB Configuration
Depending on the size of the data file you want to load into Timber and on your other query needs, you may want to change the maximum device size allowed in Timber. You may also want to turn logging on or off, change the default device name, and so on. You can do so by editing the configuration file for the underlying data management system (this applies to both Shore and Berkeley DB). The main content of the example configuration file is shown below. Typical cases requiring changes to the default configuration can be found in the FAQ page.
By default, this configuration file is located at <basedir>/TimberRoot/bin/Resource/example.conf, where TimberRoot is the root folder of Timber. You can also move the configuration file to another location by changing the SHORE_CONFIG_FILE parameter. For details, please refer to Timber Configuration.
Important Note: Except for turning on/off logging, you will have to reload your dataset to make the changes in the configuration file effective.
    # set the location of the diskrw program
    # NOTE: This must be changed to point to your installed bin directory
    ?.server.*.sm_diskrw: ../../sthread/diskrw

    # set the buffer pool size and log directory for all example servers
    #?.server.*.sm_bufpoolsize: 262144
    ?.server.*.sm_bufpoolsize: 131072
    ?.server.*.sm_logdir: ../log

    # Specify the log directory for all server programs
    ?.server.*.sm_logging: yes

    # use sm_reformat_log to reinitialize a log that is a raw device
    *.server.*.sm_reformat_log: yes

    # Need 100000 to get through the scripts with 32KB pages
    ?.server.*.sm_logsize: 100000

    perf.server.*.device_name: ../volumes/dev1
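As an illustration of the kind of change described above (for example, turning logging off), the following is a minimal Python sketch that rewrites one option in a configuration file of this shape. It is not part of Timber or Shore: the helper name set_shore_option is hypothetical, and the line format it assumes (a program pattern, a dot, the option name, a colon, then the value, with lines starting with # treated as comments) is inferred only from the example listing.

```python
import re

# Assumed line shape, inferred from the example listing only:
#   <program-pattern>.<option_name>: <value>
# Lines whose first non-blank character is '#' are treated as comments.
OPTION_LINE = re.compile(
    r"^(?P<prefix>[^#\s][^:]*\.)(?P<name>\w+):\s*(?P<value>.*)$"
)


def set_shore_option(text, name, value):
    """Return `text` with every active `name` option set to `value`.

    Hypothetical helper: comments, blank lines, and all other options
    are passed through unchanged.
    """
    out = []
    for line in text.splitlines():
        m = OPTION_LINE.match(line.strip())
        if m and m.group("name") == name:
            line = f"{m.group('prefix')}{name}: {value}"
        out.append(line)
    return "\n".join(out) + "\n"


example = """\
# set the buffer pool size and log directory for all example servers
#?.server.*.sm_bufpoolsize: 262144
?.server.*.sm_bufpoolsize: 131072
?.server.*.sm_logging: yes
"""

updated = set_shore_option(example, "sm_logging", "no")
print(updated)
```

Editing the text in memory and writing it back keeps the commented-out alternatives (such as the larger buffer pool size) intact, so they remain available for later tuning.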
Timber Configuration
Depending on your data files and query tasks, you may want to tune Timber for your specific needs. You can improve query execution speed by changing the page size, initial buffer size, buffer replacement rate, and so forth. You DO NOT have to reload your data for changes in the Timber settings file to take effect, although you do have to exit the Timber executable and restart it.
The main content of the Timber settings file is shown below, with comments describing the meaning of each parameter. Typical cases requiring changes to the default configuration can be found in the FAQ page.
By default, this configuration file is located at <basedir>/Timber/bin/Resource/timber.settings.
    #==============================
    # Timber configuration
    #==============================
    #
    #------------------------------
    # Global/Main
    #------------------------------
    # The destination shore configuration file, relative path from running directory
    SHORE_CONFIG_FILE = ../Resource/example.conf

    #------------------------------
    # MSXMLParser
    #------------------------------
    # Max depth of xml documents
    MAXDEPTH = 100
    # Max record length for node record
    MAX_RECORDLENGTH = 8192

    #------------------------------
    # PhysicalDataMng
    #------------------------------
    # No Runtime Settings

    #------------------------------
    # IndexManager
    #------------------------------
    # file path to all gist indices, it is a relative path
    # with respect to where Timber is run
    GIST_DEFAULT_FILEPATH = ../volumes/

    #------------------------------
    # DataMng
    #------------------------------
    # No Runtime Settings
    # The following can be adjusted at compile time
    # DYNAMICFILETABLE_LIST_SIZE = 16
    # MAXSCAN_NUMBER = 100

    #------------------------------
    # NodeIDMap
    #------------------------------
    DEFAULT_NODE_NUM_CAPACITY = 100000
    DEFAULT_SIZE_CAPACITY = 100000000
    # Means 30%
    DEFAULT_LRU_REMOVE_RATE = 0.3

    #------------------------------
    # Evaluator
    #------------------------------
    #this is the container size used in structural joins when a specific
    #ancestor has lots of descendants, they are kept in a shore file
    #with each record size almost PAGESIZE.
    #it is declared in indexMng but it is used in Evaluator
    #used in EvaluatorClass.cpp and ContainerClass.cpp
    PAGESIZE = 8000

    #this is the maximum number of nodes in a construct line. I.e. the
    #maximum number of nodes in a RETURN clause. An element node
    #is considered a node. An attribute is considered a node. And
    #content is considered a node.
    #used in ConstructSpecification.cpp
    MAX_CONSTRUCT_SPEC_NUM = 50

    #this is the maximum number of predicates in a filter line. I.e. the
    #maximum predicates in the WHERE clause. When I say predicates I mean
    #those in the form, for example, $a\blah = "something" NOT $a = $b.
    #used in FilterCondition.cpp
    MAX_FILTER_PRED_NUM = 20

    #this is used as the initial array size to store right input in when
    #using nested-loops value join. The number will keep increasing as
    #more right inputs are read in. The closer this number is to the actual
    #right input size without going over, the faster your query will run.
    #If the value join is passed a size other than -1, this number won't
    #be used.
    #used in ValueJoinIterator.cpp
    INITIAL_DEFAULT_RIGHT_ARRAY_SIZE = 1000

    #this is used as the initial array size to store parts of the right
    #input that match a left input when using sort-merge value join.
    #The closer this number is to the maximum number of right inputs to
    #match a left input without going over, the faster the query will run.
    # The number will keep increasing as more right inputs that match a
    #left input are read in.
    #used in ValueJoinIterator.cpp
    INITIAL_BUFFER_SIZE = 100

    #this is used as the initial array size when using in-memory Sort
    #(external doesn't keep inputs in memory). It is used to keep all inputs
    #and sort them. The closer this number is to the actual input size
    #without going over, the faster your query will run. The number will keep
    #increasing as more right inputs are read in.
    #used in SortIterator.cpp and ValueSortIterator.cpp
    INITIAL_DEFAULT_SORT_ARRAY_SIZE = 1000

    #I need to figure out what this does. I totally forgot. I'll email you
    #about it later.
    #used in ExternalSortIterator.cpp
    INITIAL_SORT_LISTS_NUM = 100

    #this is used as the initial number of nodes in the witness tree. If
    #you have an idea of the maximum number of nodes that will be in your
    #witness tree, pass it here. If more space is needed, it will
    #expand. but the closer the max number of witness tree nodes to this number
    #without getting over, the faster your queries will run.
    #used in WitnessTree.cpp
    INITIAL_TREE_SIZE = 16

    #this is used as the initial stack size that is used in structural joins.
    #it corresponds to the level of nesting of the ancestor in the join. but
    # if you are in doubt, just pass the depth of the document here. the closer
    # the depth of the document (or nesting level of ancestor) to this number, the faster
    #your query will run. As with other initial parameters, if the number is
    #smaller than expected, the size of the stack will increase.
    #used in stack.h
    INITIAL_STACK_SIZE = 25

    #this is the number of ids reserved for each structural join and external sort
    #used in the query. each id will be given to each container written to shore
    #file. so, if you know the number of containers you are writing, pass it here.
    #passing a huge number here would be safe but shore will eventually run out of
    #logical ids and then you have to initialize and reload the data.
    MAX_ID_PER_FILE = 1000

    # MLCAS initial default depth
    INITIAL_DEFAULT_DEPTH = 16

    #------------------------------
    # Optimizer
    #------------------------------

    #------------------------------
    # TextMng
    #------------------------------
    # Flag for keyword-only inverted indexing
    USE_KEYWORDS = false

    #------------------------------
    # Multicolor Environment
    #------------------------------
    # whether you want to run in Multicolor environment
    MULTICOLOR = false
    # Allowance of id assignment for Multicolor
    MCT_ALLOWANCE = 2000
    # Level cutting for Multicolor
    MCT_CUTTING = 2
    # Multicolor special directive
    # Special flag processing for billing & shipping address dual color
    MCT_SPECIAL = false
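To check which values are currently active before restarting Timber, a settings file of this form can be read with a few lines of Python. The sketch below is illustrative only and is not Timber's own parser: it assumes each active setting is a single KEY = VALUE line and that lines starting with # are comments, as in the listing. The run directory used to resolve SHORE_CONFIG_FILE is a made-up example path.

```python
import posixpath


def parse_settings(text):
    """Parse `KEY = VALUE` lines into a dict of strings.

    Blank lines and lines starting with '#' are skipped, so settings
    that are commented out in the file do not appear in the result.
    """
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


sample = """\
# The destination shore configuration file, relative path from running directory
SHORE_CONFIG_FILE = ../Resource/example.conf
# Means 30%
DEFAULT_LRU_REMOVE_RATE = 0.3
# DYNAMICFILETABLE_LIST_SIZE = 16
INITIAL_STACK_SIZE = 25
"""

settings = parse_settings(sample)
stack_size = int(settings["INITIAL_STACK_SIZE"])
remove_rate = float(settings["DEFAULT_LRU_REMOVE_RATE"])

# Hypothetical run directory, used only to show how the relative
# SHORE_CONFIG_FILE path resolves against the directory Timber runs from.
run_dir = "/opt/timber/bin/Debug"
shore_conf = posixpath.normpath(
    posixpath.join(run_dir, settings["SHORE_CONFIG_FILE"])
)
print(stack_size, remove_rate, shore_conf)
```

Because SHORE_CONFIG_FILE and GIST_DEFAULT_FILEPATH are relative to the running directory, resolving them this way is a quick check that Timber will find the files when started from a different location.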
Last Updated: 07/20/2004