Tuesday, February 15, 2011

Overview of Oracle Database Semantic Technologies

Oracle 11g is the leading database with native RDF/OWL semantic capability and is well positioned to support semantic applications (for example, social network analysis or text mining). For different use cases, see slides 8 and 9 of this presentation. Oracle 11g provides the following capabilities:

  • Scales readily to ultra-large repositories (e.g., up to 10 billion triples)
  • Has a growing ecosystem of third-party tool partners (see slide 11 of this presentation)
  • Combines SQL query of relational data with RDF graphs and ontologies
  • Leverages Oracle Partitioning and Page compression
  • Supports RAC and Exadata platforms

It also enables you to:

  • Store semantic data and ontologies
  • Query semantic data
  • Perform ontology-assisted query of enterprise relational data
  • Use supplied or user-defined inferencing to expand the power of querying on semantic data

The figure below shows how these capabilities interact.

RDF triples, the basic data model of the semantic web, are what set semantic technologies apart from relational technologies. The model is self-describing: no separate schema is needed, because the schema is built into the triples themselves. All triples are parsed and stored as entries in tables under the MDSYS schema in Oracle. Note that duplicate triples are not stored in the database. However, canonically equivalent text values that have different lexical representations are stored in the database.
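The storage behavior described above can be illustrated with a small in-memory model. This is only a sketch, not how Oracle implements it: the `TripleStore` class and its `add` method are hypothetical names, a Python set stands in for the MDSYS tables, and the typed salary literals are made-up sample values.

```python
class TripleStore:
    """Toy triple store: a set of (subject, predicate, object) tuples."""

    def __init__(self):
        self._triples = set()

    def add(self, subject, predicate, obj):
        """Insert a triple; return True if it was new, False if a duplicate."""
        triple = (subject, predicate, obj)
        if triple in self._triples:
            return False
        self._triples.add(triple)
        return True

    def __len__(self):
        return len(self._triples)


store = TripleStore()
first = store.add("Employee16530", "salary", '"80000"^^xsd:decimal')
# An exact duplicate is rejected.
duplicate = store.add("Employee16530", "salary", '"80000"^^xsd:decimal')
# Same value, different lexical form: kept as a separate entry.
variant = store.add("Employee16530", "salary", '"80000.0"^^xsd:decimal')
```

In this sketch the store ends up holding two entries: the exact duplicate is dropped, while the lexically different form of the same value is kept.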

RDF Triples

RDF triples are used to create a knowledge base (as opposed to a database). The model is defined as follows:

  • Things in the world are resources like:
    Employee, Manager, Department
  • Resources have properties (they are resources too) such as:
    first_name, employee_id, salary
  • Properties have values like:
    "John", "16530", "80,000"

A resource-property-value statement (or subject-predicate-object) is called a triple. We can generate triples from either underlying relational database data or other sources.

Subject        Predicate    Object
Employee16530  employee_id  "16530"
Employee16530  first_name   "John"
Employee16530  salary       "80,000"
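As a rough illustration of generating triples from relational data, the sketch below turns one row into the three triples listed above. The function name `row_to_triples` and the convention of building the subject from the table name plus the key value are assumptions made for this example; they are not an Oracle API.

```python
def row_to_triples(table, key_column, row):
    """Turn one relational row (a dict) into subject-predicate-object triples."""
    # Hypothetical naming convention: subject = table name + primary key value.
    subject = f"{table}{row[key_column]}"
    return [(subject, column, str(value)) for column, value in row.items()]


row = {"employee_id": 16530, "first_name": "John", "salary": "80,000"}
triples = row_to_triples("Employee", "employee_id", row)

for subject, predicate, obj in triples:
    print(subject, predicate, f'"{obj}"')
```

Each column becomes a predicate and each cell becomes an object, so adding a column to the row simply yields one more triple.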

Triples can be linked to form a graph that describes concepts (for example, Person and Organization). Properties also link resources together (for example, "works_for" in the figure).
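A minimal sketch of this linking, assuming a hypothetical `Dept10` resource and a `dept_name` property that do not appear in the original example: triples are indexed by subject, and a property such as `works_for` is followed from one resource to another.

```python
from collections import defaultdict

triples = [
    ("Employee16530", "first_name", '"John"'),
    ("Employee16530", "works_for", "Dept10"),  # links two resources
    ("Dept10", "dept_name", '"Sales"'),        # hypothetical sample data
]


def build_graph(triples):
    """Index triples by subject: subject -> list of (predicate, object)."""
    graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return graph


def objects(graph, subject, predicate):
    """Return every object reachable from subject via predicate."""
    return [o for p, o in graph.get(subject, []) if p == predicate]


graph = build_graph(triples)
dept = objects(graph, "Employee16530", "works_for")[0]
dept_name = objects(graph, dept, "dept_name")
```

Because the object of one triple can be the subject of another, following `works_for` and then `dept_name` walks from the employee to the name of the department.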

This graph data model represents the schema, which can evolve dynamically over time. For example, after appending Dept20, the new graph looks like:

Inferencing Based on Transitivity

Inferencing is the ability to make logical deductions based on rules. It enables you to construct queries that perform semantic matching based on meaningful relationships among pieces of data, as opposed to purely syntactic matching based on string or other values. Inferencing relies on rules, either supplied by Oracle or user-defined, that are placed in rulebases. Oracle 11g supports inferencing as defined by the W3C:
  • Native inferencing in the database for
    • RDF, RDFS, OWL subset
    • User-defined rules
  • New relationships/triples are inferred (or entailed) and stored ahead of query time
    • Forward Chaining
    • Entailment stored persistently to minimize on-the-fly computation, thus speeding query execution
  • Automatic identification of new relationships (triples) as shown in the figure below
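Forward chaining over a transitive property can be sketched as follows. This is a naive fixpoint loop for illustration, not Oracle's inference engine; the function name `forward_chain_transitive`, the `subClassOf` predicate label, and the class names are assumptions chosen for the example.

```python
def forward_chain_transitive(triples, transitive_predicates):
    """Materialize all triples entailed by transitivity, up to a fixpoint."""
    entailed = set(triples)
    changed = True
    while changed:
        changed = False
        for (a, p, b) in list(entailed):
            if p not in transitive_predicates:
                continue
            for (c, q, d) in list(entailed):
                # Rule: (a p b) and (b p d) entail (a p d).
                if q == p and c == b and (a, p, d) not in entailed:
                    entailed.add((a, p, d))
                    changed = True
    return entailed


asserted = {
    ("Manager", "subClassOf", "Employee"),
    ("Employee", "subClassOf", "Person"),
    ("Person", "subClassOf", "Agent"),
}
closure = forward_chain_transitive(asserted, {"subClassOf"})
inferred = closure - asserted  # the new triples, computed ahead of query time
```

Here `inferred` holds the three triples that were not asserted: Manager is a subclass of Person, Employee is a subclass of Agent, and Manager is a subclass of Agent. Storing the closure once means a later query is a plain lookup rather than a computation.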

Vocabulary Support in Oracle 11g R2

Oracle 11g R2 supports various domain ontologies, which are taxonomies that represent particular vertical domains:
  • W3C Simple Knowledge Organization System (SKOS)
    • New rulebase supporting the emerging SKOS standard on RDF
    • Enables easy sharing of controlled / structured vocabularies (thesauri, taxonomies, classification schemes)
    • Enforces integrity constraints
  • Dublin Core (for media and library)
  • SNOMED (for medical communities)
  • NCI (National Cancer Institute, Gene Ontology)
  • FOAF (Friend of a Friend)
  • GeoRSS
  • SIOC (Semantically-Interlinked Online Communities)
  • GoodRelations (eCommerce Product Ontology)
  • Others

Read More

  1. Oracle Database Semantic Technologies
  2. Oracle Semantic Technologies Downloads
  3. Oracle Semantic Technologies Inference Best Practices with RDFS/OWL
  4. Oracle Database Semantic Technologies Developer's Guide (11.2)
  5. SEM_APIS package
  6. Installation of Oracle Semantic Technologies
  7. Oracle Database Semantic Technologies: Understanding How to Install, Load, Query and Inference
  8. Oracle Database Semantic Technologies Tutorial
  9. A Scalable RDBMS-Based Inference Engine for RDFS/OWL
  10. Oracle Database Semantic Technologies - Product Performance
  11. Semantic Technologies & Triplestores for BI
