1.       Definition of Predicates
A semantic predicate is a way to enforce extra (semantic) rules upon grammar actions using plain code.
When a decision is syntactically nondeterministic, semantic predicates can resolve it, provided at least n-1 predicates are available for the n alternatives (predicates are not hoisted for deterministic alternatives)
        The nth predicate is simply !(p1||p2||..||pn-1)
        a : {p1}? A | {p2}? A | A ;
Predicates encountered during DFA construction are combined
        {p1}? Followed by {p2}? => p1&&p2
        {p1}? | {p2}? => p1||p2
Predicates beyond hoisting depth 0 are ignored
a: A {p1}? | A {p2}? ;
Predicates are both disambiguating and validating
Predicates executed for correct syntactic context
All paths matching ambiguous input must be covered with a predicate expression
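The combination rules above can be sketched in Python. This is only an illustration of the Boolean logic, not ANTLR's generated code: the helper names both, either, and implicit_last are hypothetical, and each predicate is modelled as a zero-argument callable.

```python
from typing import Callable, Sequence

Pred = Callable[[], bool]

def both(p1: Pred, p2: Pred) -> Pred:
    """{p1}? followed by {p2}?  =>  p1 && p2"""
    return lambda: p1() and p2()

def either(p1: Pred, p2: Pred) -> Pred:
    """{p1}? | {p2}?  =>  p1 || p2"""
    return lambda: p1() or p2()

def implicit_last(explicit: Sequence[Pred]) -> Pred:
    """Predicate for the uncovered nth alternative: !(p1 || ... || pn-1)."""
    return lambda: not any(p() for p in explicit)

# a : {p1}? A | {p2}? A | A ;
p1 = lambda: False  # hypothetical predicate values
p2 = lambda: False
p3 = implicit_last([p1, p2])
print(p3())  # True: neither explicit predicate holds, so the third alternative applies
```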
Semantic predicates
  • Specifies the semantic validity of applying an alternative
  • Are Boolean expressions in the target language that are evaluated at runtime
  • Used to guide recognition
  • Enforce non-syntactic rules (rules for which mere syntax is not enough to describe a valid sentence)
  • Help to recognize context-sensitive language constructs
  • Are "hoisted" up into other rules when necessary to inform the decision making process
  • Must be free of side effects
    • Must be possible to repeatedly evaluate them and get the same result
    • Evaluation order must not matter
  • Should not reference local variables or parameters (because this would break hoisting)

2.       Purpose of Predicates

Normal user code should deal with put_attr/3, get_attr/3 and del_attr/2. The routines in this section fetch or set the entire attribute list of a variable. Use of these predicates is anticipated to be restricted to printing and other special purpose operations.
get_attrs(+Var, -Attributes)
Get all attributes of Var. Attributes is a term of the form att(Module, Value, MoreAttributes), where MoreAttributes is [] for the last attribute.
put_attrs(+Var, -Attributes)
Set all attributes of Var. See get_attrs/2 for a description of Attributes.
del_attrs(+Var)
If Var is an attributed variable, delete all its attributes. In all other cases, this predicate succeeds without side-effects.
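To make the shape of the Attributes term concrete, here is a small Python model of the att(Module, Value, MoreAttributes) chain. It is only an analogy for the data structure, not how a Prolog system stores attributes: the Att class, the to_pairs helper, and the module names are hypothetical, and Python's empty list stands in for Prolog's [].

```python
from dataclasses import dataclass
from typing import Any, List, Tuple, Union

@dataclass
class Att:
    module: str
    value: Any
    more: Union["Att", list]  # next att(...) term, or [] after the last attribute

def to_pairs(attrs: Union[Att, list]) -> List[Tuple[str, Any]]:
    """Walk the chain and collect (module, value) pairs in order."""
    pairs = []
    while isinstance(attrs, Att):
        pairs.append((attrs.module, attrs.value))
        attrs = attrs.more
    return pairs

# Models att(domain, [a, b], att(frozen, true, [])); module names are made up.
chain = Att("domain", ["a", "b"], Att("frozen", True, []))
print(to_pairs(chain))  # [('domain', ['a', 'b']), ('frozen', True)]
```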

3.       Types of Predicates

A.      Validating semantic predicates
  • Look like normal actions but are followed immediately by a question mark {..}?
  • ... is a Boolean expression written in the target language; if it evaluates to false then a Failed Predicate Exception is thrown
B.       Gated semantic predicates
  • Precede alternatives and are written as {...}?=> where ... is a Boolean expression written in the target language
  • When evaluating to false, effectively disable the alternative, making it invisible to the recognizer (no exception is thrown)
  • Put another way, they dynamically "turn on" or "turn off" portions of a grammar
  • Always hoisted, even when decisions are deterministic
  • Useful when you want to distinguish between alternatives that are not syntactically ambiguous
  • May only appear in rules that actually have multiple alternatives
  • May be used in lexer and parser rules
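The difference between a failing validating predicate and a failing gated predicate can be sketched as follows. This is a hand-written Python illustration, not ANTLR output; FailedPredicateError, validate, visible_alternatives, and the allow_enum switch are hypothetical names.

```python
class FailedPredicateError(Exception):
    """Stands in for the exception raised when a validating predicate fails."""

def validate(pred, description):
    # Validating predicate {...}?: a false result raises an exception.
    if not pred():
        raise FailedPredicateError(description)

def visible_alternatives(alternatives):
    # Gated predicate {...}?=>: a false gate hides the alternative; nothing is raised.
    return [name for name, gate in alternatives if gate()]

allow_enum = False  # hypothetical language-version switch
alts = [("enum_decl", lambda: allow_enum), ("identifier", lambda: True)]
print(visible_alternatives(alts))  # ['identifier']

try:
    validate(lambda: allow_enum, "allow_enum")
except FailedPredicateError as err:
    print("failed predicate:", err)
```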
C.      Disambiguating semantic predicates
  • Precede alternatives and are written as {...}?
  • Used in making prediction decisions
  • Are only used (evaluated) when syntax alone is insufficient to distinguish between alternatives
  • Put another way, they disambiguate syntactically identical alternatives
  • Unlike gated semantic predicates, may be used in rules that have only one alternative
  • Are hoisted into rules higher up in the decision chain when LL(*) lookahead alone is not sufficient to distinguish between alternatives
  • Unlike gated semantic predicates, are not hoisted for deterministic decisions
  • Only predicates reachable from the left edge without consuming an input symbol are hoisted
  • To fully resolve a non-determinism, all alternatives must be covered by a disambiguating predicate
  • As a special case, if the last (and only the last) conflicting alternative is not covered then ANTLR implicitly covers it as a "default"
  • Predicated alternatives specified earlier have precedence over those specified later; that is, the semantic predicates are evaluated in the order specified in the alternatives in the grammar
  • Multiple predicates may be specified for a single alternative; consecutive predicates are combined with logical AND ({p1}? followed by {p2}? gives p1&&p2)
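As a rough Python sketch of how disambiguating predicates take part in prediction, including an alternative that carries more than one predicate: the predict function and its input format are assumptions made for illustration, and the sketch assumes that only the last alternative may be left without a predicate.

```python
def predict(viable):
    """viable: list of (alt_number, [predicates]) for the alternatives that
    match the lookahead syntactically, in grammar order.
    Returns the chosen alternative number, or None if no alternative applies."""
    if len(viable) == 1:
        # Deterministic decision: disambiguating predicates are not consulted.
        return viable[0][0]
    for alt, preds in viable:
        # All predicates on one alternative must hold (logical AND).
        # An empty predicate list acts as the implicit default: all([]) is True.
        if all(p() for p in preds):
            return alt
    return None

is_type = lambda: False   # hypothetical predicates
in_scope = lambda: True
print(predict([(1, [is_type, in_scope]), (2, [])]))  # 2
print(predict([(1, [is_type])]))                     # 1
```

Earlier alternatives win because the loop visits them in grammar order, which mirrors the precedence rule stated in the list.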
4.       Example

The position of a predicate within a production determines which type of predicate it is. For example, consider the following validating predicate (validating predicates may appear at any non-left-edge position), which ensures that an identifier is semantically a type name:
decl: "var" ID ":" t:ID
      { isTypeName(t.getText()) }?
    ;
Validating predicates generate parser exceptions when they fail. The thrown exception is of type SemanticException. You can catch this and other parser exceptions in an exception handler.
Disambiguating predicates are always the first element in a production because they cannot be hoisted over actions, tokens, or rule references. For example, the first production of the following rule has a disambiguating predicate that would be hoisted into the prediction expression for the first alternative:

stat:   // declaration "type varName;"
        {isTypeName(LT(1))}? ID ID ";"
    |   ID "=" expr ";"            // assignment
    ;
If we restrict this grammar to LL(1), it is syntactically nondeterministic because of the common left-prefix: ID. However, the semantic predicate correctly provides additional information that disambiguates the parsing decision. The parsing logic would be:
if ( LA(1)==ID && isTypeName(LT(1)) ) {
    match production one
}
else if ( LA(1)==ID ) {
    match production two
}
else error
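For readers who want to run the intended decision logic, here is a minimal Python sketch of the prediction for the stat rule. The token representation (a (type, text) tuple), the TYPE_NAMES set, and the function names are assumptions made for illustration; ANTLR-generated parsers look different.

```python
ID = "ID"
TYPE_NAMES = {"int", "float"}  # hypothetical symbol table of known type names

def is_type_name(token):
    """Semantic predicate: is this identifier a known type name?"""
    return token[1] in TYPE_NAMES

def predict_stat(tokens):
    """Return which production of stat applies, based on the first token."""
    if not tokens:
        raise SyntaxError("unexpected end of input")
    first = tokens[0]
    if first[0] == ID and is_type_name(first):
        return "declaration"   # production one: ID ID ";"
    elif first[0] == ID:
        return "assignment"    # production two: ID "=" expr ";"
    raise SyntaxError(f"unexpected token {first!r}")

print(predict_stat([(ID, "int"), (ID, "x"), (";", ";")]))               # declaration
print(predict_stat([(ID, "x"), ("=", "="), ("NUM", "1"), (";", ";")]))  # assignment
```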
Formerly, in PCCTS 1.xx, semantic predicates represented the semantic context of a production. As such, the semantic AND syntactic context (lookahead) could be hoisted into other rules. In ANTLR, predicates are not hoisted outside of their enclosing rule. Consequently, rules such as:
type : {isType(t)}? ID ;
are meaningless. On the other hand, this "semantic context" feature caused considerable confusion to many PCCTS 1.xx folks.
5.       Conclusion

We have presented the construction of a semantic predication gold standard from biomedical literature text using the conceptual annotation paradigm. Manual conceptual annotation is considered extremely challenging, and our results confirm this perception, while also confirming that reasonable interannotator agreement could be achieved iteratively, consistent with the findings of Bada et al. While the domain knowledge we used (UMLS) reflects the application-specific aspect of our annotation, we believe that our analysis and discussion provide important insights for future efforts in this area.
The resulting gold standard constitutes the first resource, to our knowledge, in the biomedical domain that incorporates conceptual annotation of semantic relations in a wide variety of subdomains. Two sets of annotations and the adjudicated gold standard are made publicly available for research purposes. A UMLS license is required. The corpus size is relatively small and may be insufficient for training information extraction systems. However, we believe it can serve as a benchmark to evaluate independently developed systems based on UMLS knowledge sources. Our goal is to use it for this particular purpose, as well as to guide future system development.

