l -> error |C
where l is a term, error is a reserved constant, and C is a (possibly empty) finite sequence containing
- membership tests of the form X \in rexp w.r.t. a given regular language rexp;
- equations/inequalities over terms.
l -> r {q}
where l and r are terms and q can be A or E. Completeness rules of a Web specification formalize the requirement that some information must be included in all or some pages of the Web site. We use attributes {A} and {E} to distinguish "universal" from "existential" rules, as explained below. Right-hand sides r of completeness rules can contain functions, which are defined in R. In addiction, some symbols in the right-hand sides of the rules may be marked by means of the symbol #. Marking information of a given rule r is used to select the subset of the Web site in order to check the condition formalized by r. Intuitively, the interpretation of a universal rule l -> r {A} (respectively, an existential rule l -> r {E}) w.r.t. a Web site \Website is as follows: if (an instance of) l is "recognized" in the Web Site, (an instance of) the irreducible form of r must also be "recognized" in all (respectively, some) of the Web pages that embed (an instance of) the marked structure of r. Let R be a TRS, we define X ++Y , which concatenates two strings; we also define X+Y, which sums two natural numbers. Let (R,IN,IM) be a Web specification where IN and IM are defined as follows:
member(name(X),surname(Y)) -> #hpage(fullname(X++Y)),status) {E}
hpage(status(professor)) -> #hpage(#status(#professor),teaching)) {A}
project(grant1(X),grant2(Y),total(Z)) -> error X+Y <> Z
This Web specification models some required properties
for a Web site of a research group.
The first rule formalizes the following property:
if there is a Web page containing a member list, then for each member, a home page should exist which contains (at least) the full name and the status of this member.
The full name is computed by applying the function ++ to the name and the surname of the member.
The marking information establishes that the property must be checked only on home pages (i.e., pages containing the tag "hpage").
The second rule states that, whenever a home page of a professor is recognized, that page must also include some teaching information. The rule is universal since it must hold for each professor home page (i.e., all the Web pages embedding the marked structure home(status(professor)).
The third rule states that, for each research project, the total project budget must be equal to the sum of the funds, which have been granted for the first and the second research periods. The membership tests in the conditions ensure that the values assigned to the variables are natural numbers.
Diagnoses are carried out by running Web specifications on Web sites.
The operational mechanism is based on a novel, flexible matching technique ABF06 that is able to "recognize" the partial structure of a term (Web template) within another and select it by computing homeomorphic embeddings of Web patterns within Web documents.
which closely resembles the notion of simulation, this relation has been widely used in a number of works about querying,
transformation, and verification of semistructured data.
Let us illustrate the concept of the Homeomorphic embedding by means of a rather intuitive example.
Consider the following XML document templates (called s1 and s2, respectively):
hpage(surname(Y),status(professor),name(X),teaching) hpage(name(mario),surname(rossi),status(professor), teaching(course(logic1),course(logic2)) hobbies(hobby(reading),hobby(gardening)))Note that s1
s2, since the
structure of s1
can be recognized inside the structure of s2, while s2 can be recognized inside the structure of s1.
p succeeds, by extending the proof, we construct the biggest substitution σ
for the variables in Var(l), such that lσ
p.
Then, depending on the nature of the Web specification rule (correction or completeness rule),
we proceed as:
(Correction rule) we evaluate the corresponding instance of the condition of the rule; a correctness error is signalled
in the case when the error condition is fulfilled.
(Completeness rule)
by a new homeomorphic embedding test, we check whether the instantiated right-hand side of the rule is recognized in some page of the considered Web site. Otherwise, a completeness error is signalled. Moreover, from the incompleteness symptom computed so far, a fixpoint computation is started in order to discover further missing information, which may involve the execution of other completeness rules.
| ELP | GPLIS | DSIC | UPV |