
Querying XML in the New Millennium

The site is still under construction.



What is Kweelt

Kweelt is a framework for querying XML data. Among other things, it offers an evaluation engine for the Quilt XML query language proposed by Chamberlin, Florescu and Robie, together with a number of useful extensions. Using Kweelt, one can run all the use-cases published by the W3C for the XML query requirements.
But Kweelt is not just an implementation of a query language. It has been designed as a reference platform for all sorts of research experiments related to XML: storage, query optimization, document output, etc.

Some important features of Kweelt are:

  1. Kweelt implements a query language for XML that satisfies all the requirements from the W3C query-language-requirements.
  2. Kweelt offers multiple XML back-ends. The query evaluator does not impose any specific storage for XML but relies on a set of interfaces (Node and NodeList) implemented by a NodeFactory. Kweelt comes with a couple of built-in node factories (DOM, DOM+, SAX, Wizdom), but the user can easily provide his/her own.
  3. Kweelt is open-source (GPL) to allow people to make more radical changes to the framework itself.
  4. Kweelt is fully written in Java 1.2, can run on any Java platform and has a small footprint.
  5. Kweelt is extensible. The user can create his/her own user-defined functions (UDF) and make them available inside the query. Kweelt provides various template classes to make the creation of such functions very easy.
  6. Kweelt comes with the Kweelt Server Pages (KSP) extension, a built-in Cocoon processor. KSP lets you embed Kweelt queries inside any XML page served by Cocoon.
  7. Kweelt comes with numerous working examples.

Who should use Kweelt

The short answer is everybody. More seriously, here are some domains that could take advantage of Kweelt:


Kweelt vs Quilt

Kweelt is a "personal" adaptation of the Quilt proposal where some features have been removed, some have been adapted, and others have been added. Kweelt does not claim to be faithful to the Quilt proposal. It does try to offer an intuitive, powerful and extensible syntax to query (navigate, extract, compose, reconstruct) XML documents.


Added Features

In-Lined XML
Description In-lined XML is a way to embed small pieces of XML inside the query itself. The syntax is very similar to an XML CDATA section: XML data is embedded inside a Kweelt query by enclosing it between [[ and ]]. The string is simply sent to a DOM parser to produce regular XML nodes.
Rationale Since XML offers a nice syntax to describe any kind of data, it is particularly well suited to describing constant values used as part of a query. Check the portfolio query for an example.
Java external functions
Description

Users can write their own Java code (functions) and call it from the query. Such functions have to implement the right interface or extend some template (abstract) classes and must be registered inside the query using the import ... as ...; construct.

Check Q9 for an example.

Rationale To be able to extend the language without touching the implementation itself.
Typed references (IDREFs)
Description

Quilt has introduced the "arrow operator", which dereferences an IDREF attribute to reach the corresponding element(s) (those pointed to by the value of the IDREF). For instance, in Quilt the construct @spouse->/@name returns the list of name attribute nodes of the elements whose id attribute is equal to the value of the spouse attribute.

In Kweelt, the user can give some hints to the query processor. A hint consists of a pair (element-tag-name, attribute-name) and tells Kweelt that the reference points to an element with the given element-tag-name, whose ID attribute is named attribute-name. The Kweelt syntax is: @spouse->{Person@pid, Student@sid}/@name. This is to be understood as follows: grab the Person elements for which attribute pid is equal to the value of attribute spouse, or the Student elements for which attribute sid is equal to the value of attribute spouse.

Check the queries from the REF use-cases for some examples.

Rationale

First, it is worth keeping in mind that the arrow operator is just syntactic sugar and can be described as a regular join.
The problem is that the arrow operator (just like the id function in XPath) requires some structural information (DTD, Schema, etc.) about the XML document. Without such information, the operator has no meaning because there is no way to guess which attribute has to be considered as an ID. Moreover, even in the presence of structural information, following a reference means looking at ALL elements to find out whether their ID attribute is equal to the value.

In Kweelt, we do not assume that every document comes with some structural information. Therefore the user is strongly encouraged to provide the piece of information necessary to give a meaning to the query. A hint is a way to tell Kweelt which elements to look for, by somehow typing the reference. This also makes the query easier to read.

We think that this addition is useful for various reasons.
  • it is optional and can be added by hand by the user
  • it gives more readability to queries by identifying the type of the element that gets pointed to
  • it does not force Kweelt to look for the DTD/Schema
  • it is compatible with any kind of structural description (DTD, Schema, etc.)
  • one can easily imagine a preprocessing step that would retrieve the DTD/Schema of the documents used in the query and adorn the Kweelt query with such hints
When there is no hint, our implementation will assume that the name of the attribute ID is either "ID" or "id". It turns out that most users are using either "ID" or "id" as the attribute name for the attribute of type ID. Check the Torquemada project for more info.
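
To make the "regular join" reading of a hinted reference concrete, here is a minimal sketch written against the standard org.w3c.dom API rather than Kweelt's own Node interfaces. The class HintedDeref, its Hint helper and the sample document are hypothetical names made up for this illustration; they are not part of the Kweelt distribution and do not claim to mirror its implementation.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Illustrative only: resolves a reference value using (tag, attribute) hints. */
final class HintedDeref {

    /** One hint: elements named 'tag' whose attribute 'idAttr' plays the role of an ID. */
    static final class Hint {
        final String tag;
        final String idAttr;

        Hint(String tag, String idAttr) {
            this.tag = tag;
            this.idAttr = idAttr;
        }
    }

    /** Parses an XML string into a DOM document (checked exceptions are wrapped). */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("cannot parse XML", e);
        }
    }

    /** Returns the elements matching any hint whose ID attribute equals refValue. */
    static List<Element> deref(Document doc, String refValue, Hint... hints) {
        List<Element> targets = new ArrayList<>();
        for (Hint hint : hints) {
            // Only elements with the hinted tag name are visited, not ALL elements.
            NodeList candidates = doc.getElementsByTagName(hint.tag);
            for (int i = 0; i < candidates.getLength(); i++) {
                Element e = (Element) candidates.item(i);
                if (e.hasAttribute(hint.idAttr) && refValue.equals(e.getAttribute(hint.idAttr))) {
                    targets.add(e);
                }
            }
        }
        return targets;
    }
}
```

With the hints (Person, pid) and (Student, sid), calling deref(doc, "s7", ...) plays the role of @spouse->{Person@pid, Student@sid} for a spouse attribute whose value is s7; the trailing /@name step is then an ordinary attribute lookup on each returned element.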

Modified Features

Syntax of arithmetic comparison operators
Description Arithmetic comparison operators have to be escaped inside Kweelt. We offer for each operator two alternatives as defined in the table below:
Operator   Escaped as
<          LT  or .<.
<=         LEQ or .<=.
>          GT  or .>.
>=         GEQ or .>=.

Check Q8 from the R use-cases for an example.

Rationale The arithmetic comparison operators can be confused with XML mark-up, which makes parsing ambiguous.
FILTER operator
Description The FILTER operator retains only those nodes that individually satisfy the path expression, and does not retain their descendant nodes unless those nodes also satisfy the path expression. In Kweelt, the right-hand side of the FILTER operator is a predicate (i.e. an expression that evaluates to true or false).
Instead of writing

document("cookbook.xml") FILTER //section | //section/title | //section/title/text()
one needs to write

document("cookbook.xml") FILTER ( section OR title[ancestor::section] OR text()[ancestor::title/ancestor::section] )
Check queries Q1, Q11 and Q12 from the TREE use-cases.
Rationale The definition of the semantics of the FILTER operator implies that the right-hand side is a predicate. This also makes the implementation easier.
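
The predicate reading of FILTER can be sketched in a few lines on top of the standard DOM API. This is only an illustration under stated assumptions: FilterSketch, filter, isNamed and hasAncestor are hypothetical names, the predicate is an ordinary Java Predicate instead of a Kweelt expression, and retained nodes are shallow clones, which is not how Kweelt represents them internally.

```java
import java.io.StringReader;
import java.util.function.Predicate;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentFragment;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

/** Illustrative only: keeps the nodes accepted by a predicate and preserves their nesting. */
final class FilterSketch {

    /** Parses an XML string into a DOM document (checked exceptions are wrapped). */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("cannot parse XML", e);
        }
    }

    /** Returns a fragment holding shallow copies of the retained nodes. */
    static DocumentFragment filter(Document doc, Predicate<Node> keep) {
        DocumentFragment result = doc.createDocumentFragment();
        copyKept(doc, result, keep);
        return result;
    }

    private static void copyKept(Node source, Node target, Predicate<Node> keep) {
        for (Node child = source.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (keep.test(child)) {
                // Retained node: shallow copy, then look for retained descendants.
                Node copy = child.cloneNode(false);
                target.appendChild(copy);
                copyKept(child, copy, keep);
            } else {
                // Dropped node: its retained descendants attach to the nearest retained ancestor.
                copyKept(child, target, keep);
            }
        }
    }

    /** True if n is an element with the given tag name. */
    static boolean isNamed(Node n, String name) {
        return n.getNodeType() == Node.ELEMENT_NODE && name.equals(n.getNodeName());
    }

    /** True if some ancestor of n is an element with the given tag name. */
    static boolean hasAncestor(Node n, String name) {
        for (Node p = n.getParentNode(); p != null; p = p.getParentNode()) {
            if (isNamed(p, name)) {
                return true;
            }
        }
        return false;
    }
}
```

A predicate such as n -> isNamed(n, "section") || (isNamed(n, "title") && hasAncestor(n, "section")) corresponds to the first two branches of the FILTER expression shown above.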

Removed Features

LET .. EVAL
Description ??
Rationale Why do we need it? LET .. RETURN seems to be enough.

Features Not Yet Supported

XML comments and PI creation
Description A Kweelt query cannot create comments or processing instructions.
Rationale None. Laziness. But it is trivial to do.
Namespaces
Description Kweelt ignores namespaces. xhtml:html is considered as a regular tag name and the inner tags do not inherit the xhtml namespace.
Rationale None.

License

Kweelt is open-source and made available through the GNU General Public License.

If you think you need another type of license, contact me.


Download

The Kweelt distribution can be downloaded from this page. You can choose between various formats (zip or tar.gz). By using the URL, you accept the conditions of the license agreement.

For some other packages needed by the distribution, we provide below some pointers and - when it is legally possible - a copy of the jar file. The copy is not always the latest version. You are encouraged to download from the official site in order to get the latest version, the release notes and the documentation.


Java 1.2

IBM Java 2 v1.3 for Linux official download
IBM Java 2 v1.3 for Windows (WebSphere) official download
Javasoft Java 2 Standard Edition official download

XML and related libraries

Apache Xerces parser official download unofficial download
Apache Jakarta Regexp official download unofficial download
Apache Cocoon official download unofficial download
Oracle parser official download I am not allowed to re-export their software
Sun parser official download I am not allowed to re-export their software
AElfred SAX parser official download unofficial download

Installation

Depending on what one wants to do, the set-up might be slightly different. You can use your favorite Java 1.2 implementation. We recommend the one from IBM. The default NodeFactory is based on Apache Xerces.
For the rest of the section, we use only the name of each jar file in the definition of the CLASSPATH. For an actual configuration, one has to provide the full path instead of just the name of the jar file.

kweelt.jar   the Kweelt package
rt.jar       the Java 1.2 runtime classes
xerces.jar   Apache DOM/SAX package
regex.jar    Apache regular expression package

Simple set-up (query parsing only)

export CLASSPATH=rt.jar:kweelt.jar (Unix/bash)
set CLASSPATH=rt.jar;kweelt.jar (Win)


Simple set-up (query parsing + query evaluation)

export CLASSPATH=rt.jar:kweelt.jar:xerces.jar (Unix/bash)
set CLASSPATH=rt.jar;kweelt.jar;xerces.jar (Win)

If you decide to use user-defined functions, make sure that you also include them (as a jar file or as a directory) in your classpath.


Recommended set-up

export CLASSPATH=rt.jar:kweelt.jar:xerces.jar:regex.jar (Unix/bash)
set CLASSPATH=rt.jar;kweelt.jar;xerces.jar;regex.jar (Win)


KSP set-up (the whole enchilada)

If you want to use Kweelt as part of Cocoon, you first need to have Cocoon up and running. Check the Cocoon Web site for some installation information.
Once Cocoon is up-and-running, you need to do two things: (1) tell the servlet engine that Cocoon uses where the Java classes needed by Kweelt are located; (2) register Kweelt as a Cocoon processor.

If you use JServ as your servlet engine, you need to add the following lines in the jserv.properties file:
wrapper.classpath=rt.jar
wrapper.classpath=kweelt.jar
wrapper.classpath=regex.jar

Also make sure that JServ will use a Java 1.2 VM, by having the following line in the jserv.properties file:
wrapper.bin=path_to_java_1.2/bin/java

To register Kweelt in Cocoon, simply add the following line in the cocoon.properties file:
processor.type.kweelt = xacute.ksp.KweeltProcessor


Testing for the installation

To test for the installation, you simply need to run Kweelt using the -v switch to get the version number.
java -classpath $CLASSPATH xacute.quilt.Main -v (Unix/bash)
java -classpath %CLASSPATH% xacute.quilt.Main -v (Win)

You should get the following answer:


[sahuguet@isis]$ java xacute.quilt.Main -v
+---------------------------------------------+
| Kweelt version 1.04 / Bergerac / 2000-09-01 |
| Copyright 2000, Arnaud Sahuguet             |
| MAIL: Arnaud.Sahuguet@polytechnique.org     |
| URL: http://db.cis.upenn.edu/Kweelt         |
+---------------------------------------------+
[sahuguet@isis]$
Running Kweelt with the -v switch

This is not tremendously informative, but it proves that the Kweelt classes are properly installed. The Kweelt command line also provides a way to check which of the third-party packages it might need are available. To do so, simply run Kweelt with the --check-classpath switch.


[sahuguet@isis]$ java xacute.quilt.Main --check-classpath
Looking for available packages:
(*) package Java 2 (mandatory) FOUND
(*) package W3C DOM (mandatory) FOUND
(*) package SAX (mandatory) FOUND
(*) package Xerces DOM/SAX parsers (mandatory) FOUND
(*) package Sun Project X DOM/SAX parsers (optional) NOT FOUND
(*) package Oracle DOM parser (optional) NOT FOUND
(*) package Apache Regular Expression (optional) FOUND
[sahuguet@isis]$
Running Kweelt to test for installed packages

This way you can check which packages are available in the CLASSPATH you are using to run Kweelt.
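
If you want to run a similar check from your own code, one common technique is to probe the classpath for one well-known class per package. The sketch below is an assumption about how such a check can be written, not a copy of what --check-classpath does; ClasspathCheck is a hypothetical class, and the probe class names are our own choice of representative classes.

```java
/** Illustrative only: reports whether a representative class of a package can be loaded. */
final class ClasspathCheck {

    /** True if the class can be located by the class loader, without initializing it. */
    static boolean isAvailable(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** Formats one report line in the style of the output shown above. */
    static String report(String label, String probeClass, boolean mandatory) {
        return "(*) package " + label + " (" + (mandatory ? "mandatory" : "optional") + ") "
                + (isAvailable(probeClass) ? "FOUND" : "NOT FOUND");
    }

    public static void main(String[] args) {
        System.out.println("Looking for available packages:");
        // The probe classes below are assumptions picked for this sketch.
        System.out.println(report("W3C DOM", "org.w3c.dom.Node", true));
        System.out.println(report("SAX", "org.xml.sax.InputSource", true));
        System.out.println(report("Xerces DOM/SAX parsers", "org.apache.xerces.parsers.DOMParser", true));
        System.out.println(report("Apache Regular Expression", "org.apache.regexp.RE", false));
    }
}
```

Passing false as the second argument of Class.forName avoids running static initializers, so the check has no side effects beyond locating and loading the class.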


Running some examples

The Kweelt distribution comes with numerous examples located in the useCases folder. We provide examples from the W3C use-cases and other ones that illustrate other features of the framework.

XMP Use Case "XMP": Experiences and Exemplars
TREE Use Case "TREE": Queries that preserve hierarchy
SEQ Use Case "SEQ" - Queries based on Sequence
R Use Case "R" - Access to Relational Data
SGML Use Case "SGML": Standard Generalized Markup Language
TEXT Use Case "TEXT": Full-text Search
NS Use Case "NS" - Queries Using Namespaces
PARTS Use Case "PARTS" - Recursive Parts Explosion
REF Use Case "REF" - Queries based on References
AdvancedExamples Advanced Examples

Usually query examples require some local files. Make sure that you run the query from the correct folder, or modify the query to point to the right document.


Recompiling the source

To recompile the source code, you need to: (1) adapt the rules.mk file by providing the right location for the Java libraries and runtime; (2) set the environment variable TOP to the src folder. The easiest way is to go to the src directory (where the gnu and xacute folders are located) and type export TOP=`pwd` if you are using bash or setenv TOP `pwd` if you are using tcsh.
Then run the makefile (from any location inside the source tree) by typing make -e (the -e asks make to use environment variables).

If you are using Windows, the best way is to install Cygwin, a Unix shell that runs under Windows.


Kweelt architecture


Global Design

The design of Kweelt has been guided by modularity. The core of the system is the xacute.quilt package that contains - among other things - the query parser and the query evaluator.
The xacute.quilt package relies on interfaces and constants defined in the xacute.common package. The xacute.quilt package is not bound to any XML-related implementation. Everything is defined in terms of xacute.common.Node and xacute.common.NodeList. Access to the implementation of Node and NodeList is performed via the xacute.common.NodeFactory.

Some implementations for interfaces xacute.common.Node, xacute.common.NodeList are located in the xacute.impl package. xacute.impl.dom offers an implementation based on the DOM interface. xacute.impl.xdom offers an implementation based on the Xerces DOM implementation where some methods not provided as part of the DOM interface are being used for better performance. Implementations rely on default implementations (abstract classes) from the xacute.util package. Should you want to write your own implementation for these interfaces, you are strongly encouraged to subclass the abstract classes from xacute.util. Package xacute.util also offers various utility functions used by the Kweelt framework.

The Kweelt evaluation engine is organised in terms of expressions, the top-level class being QuiltExpression. Every QuiltExpression must implement the method eval(EvalContext con) that returns a xacute.quilt.Value.
xacute.quilt.Value is an interface for the Kweelt base types: ValueBool, ValueString, ValueNum (every number is represented as a float), ValueNode and ValueNodeList. ValueNode and ValueNodeList are just wrapping classes for xacute.common.Node and xacute.common.NodeList.
EvalContext is a class used to carry the evaluation context along the various steps of the evaluation of a query. Among other things, it contains: (1) XPath-related information: the current node, the position of the current node in the current nodelist and the size of the nodelist; (2) variable bindings for LET and FOR; (3) bindings for user-defined functions; (4) information about the NodeFactory to be used for node creation.

In most cases, Kweelt does not have to worry about nodes. The node management is delegated to the NodeFactory. Nodes are created by a specific implementation and Kweelt does not have to know anything about that. Nodes are usually created by directly parsing an XML file and building an in-memory representation of it, and are manipulated via the interfaces we mentioned above.
The problems come when the query needs to create new nodes, i.e. nodes that do not belong to any physical XML document. This happens, for instance, when the query creates some XML elements or when it uses operators like SHALLOW or FILTER. In this case, we need to have a way to create new nodes. We could require the NodeFactory to provide ways to create new nodes. But what if the NodeFactory relies on a SAX-based back-end? How do we create nodes? What if the NodeFactory relies on a SQL back-end? Do we need to make a SQL insert for every new node we want to create?
The solution we adopt is to have Kweelt take care of the new nodes. Hopefully, most of the nodes will be physical nodes and the number of new nodes will be limited. Kweelt defines classes for text nodes (xacute.util.QuiltTextNode), element nodes (xacute.util.QuiltElementNode), attribute nodes (xacute.util.QuiltAttributeNode) and shallow nodes (xacute.util.QuiltShallowNode).
This solves a lot of headaches and makes the NodeFactory interface very simple (basically, it is read-only). On the other hand, we need to make sure that these new nodes behave like real nodes and, in particular, that they offer the same navigation capabilities along the various XPath axes. This is crucial when queries are nested and the result of the first one is used by the second.

Another issue is whether to copy nodes or not. When Kweelt evaluates a query and a given node is going to be part of the result, should the node be copied and added to the result, or should only a pointer to it be added? The result of a query is in most cases a nodelist of pointers to physical nodes. We have decided to use pointers whenever possible to save on resources. This is in particular the case when the SHALLOW operator is used: instead of duplicating the node, we use a QuiltShallowNode, which is simply a wrapper on top of a regular Node whose behavior has been modified to reflect the semantics of SHALLOW.
When we need to produce (output) the result, we simply navigate the structure by following pointers. This way, we can take advantage of the underlying XML back-end. In the current implementation, the output of a result uses SAX events, which means that the result is NEVER really materialized. An extreme case we plan to investigate is when the documents we are handling are huge and stored in a database back-end. Assuming the result is small (in terms of number of pointers), Kweelt should be able to evaluate and output the query with minimal resources, all the hard work being performed by the database back-end.
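
The idea of a result made of pointers that is only turned into SAX events at output time can be sketched as follows. This is a simplified illustration under several assumptions: it uses the standard org.w3c.dom nodes and the SAX2 ContentHandler shipped with the JDK rather than Kweelt's own Node interface and DocumentHandler, and PointerResultOutput and ToText are hypothetical names.

```java
import java.io.StringReader;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

/** Illustrative only: a result is a list of node pointers, streamed out as SAX events. */
final class PointerResultOutput {

    /** Parses an XML string into a DOM document (checked exceptions are wrapped). */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("cannot parse XML", e);
        }
    }

    /** Fires SAX events for the subtree rooted at node; the tree itself is never copied. */
    static void emit(Node node, ContentHandler out) throws SAXException {
        short type = node.getNodeType();
        if (type == Node.ELEMENT_NODE) {
            AttributesImpl atts = new AttributesImpl();
            NamedNodeMap map = node.getAttributes();
            for (int i = 0; i < map.getLength(); i++) {
                Node a = map.item(i);
                atts.addAttribute("", a.getNodeName(), a.getNodeName(), "CDATA", a.getNodeValue());
            }
            String name = node.getNodeName();
            out.startElement("", name, name, atts);
            for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
                emit(c, out);
            }
            out.endElement("", name, name);
        } else if (type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE) {
            char[] text = node.getNodeValue().toCharArray();
            out.characters(text, 0, text.length);
        }
        // Other node types (comments, processing instructions) are skipped in this sketch.
    }

    /** Toy handler that turns SAX events back into text (no character escaping). */
    static final class ToText extends DefaultHandler {
        final StringBuilder buf = new StringBuilder();

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts) {
            buf.append('<').append(qName);
            for (int i = 0; i < atts.getLength(); i++) {
                buf.append(' ').append(atts.getQName(i)).append("=\"").append(atts.getValue(i)).append('"');
            }
            buf.append('>');
        }

        @Override
        public void endElement(String uri, String local, String qName) {
            buf.append("</").append(qName).append('>');
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            buf.append(ch, start, length);
        }
    }

    /** Streams every pointed-to node, in list order, through a ToText handler. */
    static String output(List<Node> result) {
        ToText handler = new ToText();
        try {
            for (Node pointer : result) {
                emit(pointer, handler);
            }
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return handler.buf.toString();
    }
}
```

The list passed to output holds references into the parsed document, so reordering or selecting nodes costs only a few pointers; the subtrees are walked once, when the events are fired.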


Core Architecture

Kweelt Core Architecture

Supported back-ends

Name            Type                                     Classname (xacute.impl)   Status
Xerces parser   DOM                                      .dom.NodeFactoryXerces    operational
Xerces parser   DOM + node index                         .xdom.NodeFactoryXerces   operational
Oracle parser   DOM                                      .dom.NodeFactoryOracle    operational
Sun parser      DOM                                      .dom.NodeFactorySun       operational
SAX-LD          SAX + naive linear array-based storage   .ldtree.NodeFactory       almost ready for prime-time
Wizdom          binary file storage                      .wizdom.NodeFactory       almost ready for prime-time
mySQL           relational back-end                      .sql.NodeFactory          under development (David White)

Cocoon-based Deployment Architecture

Kweelt: Cocoon-based Deployment Architecture

Extending the Kweelt Framework


Writing your own top-level class

Executing a query is a very simple process: (1) create an instance of the parser; (2) parse the query into a QuiltQuery object; (3) evaluate it.
The evaluation requires an EvalContext. The role of the EvalContext is to provide information about the NodeFactory. The result of the evaluation always produces some SAX events and therefore the eval method requires a SAX DocumentHandler as an argument.

Kweelt provides various DocumentHandler implementations in the xacute.util package, such as: ToStringHandler to produce a String; OutputHandler to send the result to some output stream; HashHandler to produce an MD5 hash of the XML output; DevNullHandler to do nothing.


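String s;                                        // s must hold the text of the query before it is parsed
QuiltParser parser = new QuiltParser();          // (1) create an instance of the parser
QuiltQuery query = parser.parseQuery(s);         // (2) parse the query into a QuiltQuery object
EvalContext con = new EvalContext();
con.setNodeFactory( new MyNodeFactory() );       // tell the evaluator which NodeFactory to use
DocumentHandler handler = new MyOutputHandler(); // receives the SAX events of the result
query.eval(handler, con);                        // (3) evaluate the query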
Using the Kweelt API

Writing your own Java functions

You can enrich Kweelt with arbitrary functions defined as Java classes. To do so, you must: (1) write your function by subclassing one of the function template classes; (2) compile it and make sure that it is in a location reachable by the Java classpath; (3) import the function in the Kweelt query using the import keyword.

An example of a query that uses imported functions can be found in the UDF directory.


Writing your own NodeFactory

Writing your own NodeFactory is a different kind of undertaking and you should think twice before doing it. The NodeFactory is in charge of: (1) parsing XML, (2) instantiating nodes that support the Kweelt Node interface and (3) offering Node and NodeList primitives.

Kweelt comes with default implementations (in the form of abstract classes) for NodeFactory, NodeImpl and NodeListImpl. By subclassing them, all together or separately, the extra work you need to provide is small.

If you feel the need to write your own implementation, you will have to provide a class that implements the NodeFactory interface. Since NodeFactory uses both NodeImpl and NodeListImpl, you will have to provide some implementation for them too.


JavaDoc

The JavaDoc documentation is available from here.


KSP

KSP stands for Kweelt Server Pages. KSPs are a way to embed Kweelt queries into XML pages (in the spirit of server-side-includes, active-server-pages, etc.). Instead of re-inventing the wheel, we provide KSP as a special Cocoon processor. This way, you can take advantage of all the nice features of Cocoon and use Kweelt as another way to produce content. Moreover, the architecture of Cocoon lets you post-process the output of KSP using other Cocoon processors (e.g. XSLT).

Making Kweelt a new Cocoon processor takes only a few lines of Java code (thanks to the modular design of Cocoon, I guess). Feel free to modify the code to meet your specific needs.

You can embed as many queries as you want per page. A query is introduced using an outer kweelt tag and an inner kweelt-query tag. The easiest way is to write the Kweelt query as a CDATA node, to avoid the burden of escaping characters. An example is provided below.


<?xml version="1.0"?>
<?cocoon-process type="kweelt"?>
[...]
<kweelt>
  <kweelt-query>
    <![CDATA[
      put your query here
    ]]>
  </kweelt-query>
</kweelt>
[...]
KSP: the input

The first processing-instruction tells Cocoon how to process the XML page (the XML document is directly sent to the Kweelt processor). The Kweelt processor extracts the information inside the kweelt tag and replaces the subtree rooted at kweelt-query with a new subtree rooted at kweelt-result. This new subtree contains the result of the query inside the data tag, together with some meta-information available for further processing, such as parsing and execution time.


<?xml version="1.0"?>
[...]
<kweelt>
  <kweelt-result>
    <data>
      the result of the evaluation of the Kweelt query
    </data>
  </kweelt-result>
</kweelt>
[...]
KSP: the output

It is often useful to be able to pass parameters to the KSP. For instance, one can write a KSP to produce the list of publications for a given author. The template is the same for every author: the only difference is the name of the author. KSP supports parameterized queries. In the query itself, parameters are introduced using {@var}. Parameters are transmitted to the KSP via the HTTP QUERY_STRING.
For instance, the following page (say publicationBy.xml) should be called as publicationBy.xml?author=Arnaud&after=1995. The Kweelt processor will extract the values of the parameters and substitute them in the query itself: {@author} will be replaced with Arnaud and {@after} will be replaced with 1995.


<?xml version="1.0"?>
<?xml-stylesheet href="my-stylesheet.xsl" type="text/xsl"?>
<?cocoon-process type="kweelt"?>
<?cocoon-process type="xslt"?>
<doc>
  <H1>First Query</H1>
  <kweelt>
    <kweelt-query>
      <![CDATA[
        <P>
          <UL>
            LET $items := document("bib.xml")/*[@year .>=. {@after}][CONTAINS(author, "{@author}")][count(author)=1]
            FOR $item IN $items
            RETURN <LI>$item/title/text(), " ", <U>$item/@year</U></LI>
            SORTBY (./U DESCENDING)
          </UL>,
          CONCAT( " ", NUMFORMAT("##", count($items)), " items found.")
        </P>
      ]]>
    </kweelt-query>
  </kweelt>
</doc>
KSP using parameters

The {@var} syntax is only recognized by KSP and should be viewed as a pre-processing instruction. This is NOT part of the Kweelt query language itself.
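
As a rough picture of this pre-processing step, the sketch below replaces every {@var} in a query string with the value found in a parameter map. It is an assumption about how such a substitution can be written, not the code of the Kweelt processor: KspParams and substitute are hypothetical names, the parameter map is assumed to have been extracted from the QUERY_STRING already, and unknown parameters are left untouched by choice.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative only: textual substitution of {@var} placeholders in a query. */
final class KspParams {

    // A placeholder is {@name}, where name is made of letters, digits and underscores.
    private static final Pattern PARAM = Pattern.compile("\\{@([A-Za-z_][A-Za-z0-9_]*)\\}");

    /** Returns the query with each known {@name} replaced by its value. */
    static String substitute(String query, Map<String, String> params) {
        Matcher m = PARAM.matcher(query);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = params.get(m.group(1));
            // Unknown parameters are left as-is (a choice made for this sketch).
            String replacement = (value != null) ? value : m.group();
            // quoteReplacement keeps '$' and '\' in values from being read as group references.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

Since the substitution is purely textual, a value containing a double quote or a bracket would change the structure of the query; a real deployment would probably want to validate or escape parameter values first.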

To wrap-up, the DTD used for the Kweelt Server Pages is defined below:


<!ELEMENT kweelt (kweelt-query|kweelt-result)>
<!ELEMENT kweelt-query (#PCDATA)>
<!ELEMENT kweelt-result (data, parsingLog, ExecLog) >
<!ATTLIST kweelt-result parsing "OK"|"FAIL" >
<!ATTLIST kweelt-result parsingTime CDATA >
<!ATTLIST kweelt-result execution "OK"|"FAIL" >
<!ATTLIST kweelt-result executionTime CDATA >
<!ELEMENT parsingLog (#PCDATA)>
<!ELEMENT ExecLog (#PCDATA)>
<!ELEMENT data ANY>
Kweelt DTD

The page generated by the Kweelt processor can be further processed, using XSLT for instance. The distribution provides various examples.


Test Suite

Kweelt comes with a test suite borrowed from the W3C use-cases. The examples from the test suite can be used to validate the parser and validate the evaluation engine.

To make the testing easier, we provide two bash scripts test-parser.sh and test-engine.sh. They will need to be configured by the user to specify the location of the Java runtime and the Java libraries.
To use them, you simply need to go into one of the use-case directories and run the program. The script will look for queries and either parse them or evaluate them and compare the output with the published result. Feel free to change the scripts.

test-parser simply calls the Kweelt parser for a given query (Quilt queries use the .qlt extension) and expects a status code equal to 0 (the program should exit with exit value 0).

test-engine is more complex. It needs to evaluate the query and compare it with the result published in the use-cases. In order to compare two XML documents, we compute the hash (MD5) of a document. Basically, we navigate the XML tree and hash all the nodes. Attribute nodes are sorted in lexicographic order. Text nodes are trimmed. The code of the handler can be found here.
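
The comparison by hashing can be sketched as follows, using the standard DOM parser and java.security.MessageDigest. This is not the code of the handler shipped with Kweelt (which works on SAX events); XmlHash is a hypothetical class, and the exact byte layout fed to the digest is an assumption made for the sketch.

```java
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

/** Illustrative only: MD5 of an XML document, ignoring attribute order and outer whitespace. */
final class XmlHash {

    /** Returns the hexadecimal MD5 of the normalized document. */
    static String md5(String xml) {
        try {
            Node root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml))).getDocumentElement();
            MessageDigest md = MessageDigest.getInstance("MD5");
            feed(root, md);
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (Exception e) {
            throw new IllegalStateException("cannot hash document", e);
        }
    }

    private static void feed(Node node, MessageDigest md) {
        short type = node.getNodeType();
        if (type == Node.ELEMENT_NODE) {
            update(md, "<" + node.getNodeName());
            // Attributes are sorted in lexicographic order of their names.
            Map<String, String> sorted = new TreeMap<>();
            NamedNodeMap atts = node.getAttributes();
            for (int i = 0; i < atts.getLength(); i++) {
                sorted.put(atts.item(i).getNodeName(), atts.item(i).getNodeValue());
            }
            for (Map.Entry<String, String> a : sorted.entrySet()) {
                update(md, " " + a.getKey() + "=" + a.getValue());
            }
            update(md, ">");
            for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
                feed(c, md);
            }
            update(md, "</" + node.getNodeName() + ">");
        } else if (type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE) {
            // Text nodes are trimmed; whitespace-only nodes contribute nothing.
            String text = node.getNodeValue().trim();
            if (!text.isEmpty()) {
                update(md, text);
            }
        }
    }

    private static void update(MessageDigest md, String s) {
        md.update(s.getBytes(StandardCharsets.UTF_8));
    }
}
```

Two documents that differ only in attribute order or in indentation produce the same hash, which is what makes the comparison with the published results practical.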


Call for contribution

In the spirit of the open-source movement, everyone is encouraged to contribute to the project. As mentioned above Kweelt has been designed in a modular way to make extensions easy. Here are some ideas for contributions:


Resources


FAQ

Q:Why does Kweelt require Java 1.2?
A:

Because I am lazy. The only feature of Java 1.2 that Kweelt makes use of is the Collections API and the sort method it provides. So, by simply rewriting the code for sort, one could make it work under Java 1.1. Java 1.2 also offers collections that do not incur the overhead of synchronized methods. Since Kweelt needs to create many lists, there is a potential benefit in using 1.2.


Q:What does Kweelt stand for?
A:

Nothing. It just sounds like "Quilt".

After further inquiry (thanks to Frank N.), it turns out that 'kweelen' is a Dutch verb with two meanings (kweelt then is 3rd person singular, as in 'he kweelt'):

1 - to sing in a beloving way; this is a word mostly used by poets,
normal people :-) use it in an ironic way, if I would say "he *kweelt*
a song" then it would be a hint that he sings very badly
2- another meaning is to suffer, but also this is not really used by
normal people.

Q:When I run a query I do not always get the same result in terms of order
A:

Order is a major issue for XML query languages.

First the notion of order is not clearly defined. For instance, when you union nodes from two different documents, what is the order? When you union attribute nodes, what is the order?

Second, DOM does not provide a good way to compare two nodes. The only way is to grab for both nodes their ancestors, reach the first common node and from there determine the order by comparing child node indices. The big problem is that this is a very expensive (in terms of the depth of the document tree) operation.
The good news is that xacute.impl.xdom.NodeFactoryXerces takes advantage of some internals of the Xerces implementation that provide a comparison function between nodes for free (comparing two ints). But this is not DOM anymore.

The Kweelt strategy is ALWAYS to try to perform an ordered union. When it cannot (because two nodes cannot be compared, which triggers an exception), Kweelt performs an un-ordered union (aka append). This way, you get the best result available for every query. Using an ordered vs. un-ordered union also has some consequences in terms of performance, especially if you have to use DOM (pure DOM) to compare nodes.
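
To illustrate both the ancestor-based comparison and the fallback to append, here is a minimal sketch on top of the standard org.w3c.dom API. It is not Kweelt's code: DocOrder, compare and union are hypothetical names, and using IllegalArgumentException to signal incomparable nodes is an assumption of the sketch. (DOM Level 3 also offers compareDocumentPosition; the sketch spells the comparison out to show where the cost comes from.)

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

/** Illustrative only: document-order comparison via ancestors, and ordered union with fallback. */
final class DocOrder {

    /** Parses an XML string into a DOM document (checked exceptions are wrapped). */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("cannot parse XML", e);
        }
    }

    /** Path from the root down to node (inclusive), built by following parent pointers. */
    static List<Node> pathFromRoot(Node node) {
        List<Node> path = new ArrayList<>();
        for (Node n = node; n != null; n = n.getParentNode()) {
            path.add(n);
        }
        Collections.reverse(path);
        return path;
    }

    /** Position of child among its siblings. */
    static int indexInParent(Node child) {
        int index = 0;
        for (Node s = child.getPreviousSibling(); s != null; s = s.getPreviousSibling()) {
            index++;
        }
        return index;
    }

    /** Negative if a comes before b in document order; throws if they share no root. */
    static int compare(Node a, Node b) {
        if (a.isSameNode(b)) {
            return 0;
        }
        List<Node> pa = pathFromRoot(a);
        List<Node> pb = pathFromRoot(b);
        // Attribute nodes have no parent in DOM, so they end up here as well.
        if (!pa.get(0).isSameNode(pb.get(0))) {
            throw new IllegalArgumentException("nodes cannot be compared");
        }
        int i = 0;
        while (i < pa.size() && i < pb.size() && pa.get(i).isSameNode(pb.get(i))) {
            i++;
        }
        if (i == pa.size()) {
            return -1; // a is an ancestor of b
        }
        if (i == pb.size()) {
            return 1; // b is an ancestor of a
        }
        return Integer.compare(indexInParent(pa.get(i)), indexInParent(pb.get(i)));
    }

    /** Ordered union when every pair is comparable, plain append otherwise. */
    static List<Node> union(List<Node> left, List<Node> right) {
        List<Node> appended = new ArrayList<>(left);
        for (Node candidate : right) {
            boolean present = false;
            for (Node n : appended) {
                present = present || n.isSameNode(candidate);
            }
            if (!present) {
                appended.add(candidate);
            }
        }
        try {
            List<Node> ordered = new ArrayList<>(appended);
            ordered.sort(DocOrder::compare);
            return ordered;
        } catch (IllegalArgumentException e) {
            return appended; // un-ordered union (aka append)
        }
    }
}
```

Each call to compare walks up to the root from both nodes, so sorting a large nodelist this way is costly on deep documents, which is the performance concern mentioned above.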


Q:When I use SHALLOW, FILTER or when I nest RETURN statements, I get some funny answers
A: The problem comes from the difference between physical and new nodes. Physical nodes are nodes that exist in a physical XML document and come from the parsing of this document. Their behavior is managed by the NodeFactory and its underlying XML back-end. New nodes are nodes that are freshly created out of the query via element construct (XML tags you use in the RETURN clause) and the SHALLOW and FILTER operators. If you think about it, these nodes do not exist anywhere and we need to create them. The behavior of these new nodes is defined in the xacute.util.Quilt_xxxx_Node classes and not all the navigation primitives have been implemented consistently. THIS IS A SHORTCOMING OF KWEELT THAT WILL BE FIXED IN THE FUTURE.
Q:How can I stay informed?
A:A mailing list should be set up really soon. Meanwhile, simply visit the Kweelt webpage on a regular basis.
Q:How can I contribute?
A:There are many ways to contribute to the project. Using Kweelt is already a good way to contribute. Building applications that rely on Kweelt is a good way to identify shortcomings and think about contributions.

Competition


Querying XML in the previous millennium


To-do list


Credits

Kweelt is the result of the joint work of 2 people (Laurent and Arnaud) for a few weeks during the summer of 2000, in the Database Research Group at the University of Pennsylvania. In parallel, some work was being done (Thien-Loc) on wizdom, a file-based XML back-end to be used inside Kweelt. The Kweelt work ended in a prototype running most of the use-cases.
Since then, the prototype has been completely rewritten (Arnaud) to be modular, run all the use-cases, extend the language, provide KSP, etc.

Basically, these guys are the French Connection, also known as the X-Men (to be pronounced "icks-men"). See below:


Ecole Polytechnique (aka l'X)

Kweelt in the news (coming soon :-)


Bug Report

To make fixing bugs easier and faster, bug submitters are encouraged to provide the following information:

If you can, do not submit a complex example. Try to identify which element of the query creates the bug.

Bug reports can be submitted here.


Artwork

The Official Kweelt Logo

Index of Directories

You can browse the distribution on-line if you wish. The file structure is available below.
If you just want to download some stuff, go to the download directory.

Project hosted by SourceForge.

Page created and maintained by Arnaud Sahuguet.
Last update: 18-Sep-2000.