Tutorial – Semantic Digital Libraries -  Introduction  -  Sebastian R. Kruk ,  Bernhard Haslhofer,  Philipp Nußbaumer ,  Sandy Payette,  Tomasz Woroniecki
Tutorial overview. Who we are: Sebastian R. Kruk, DERI Galway, Ireland; Bernhard Haslhofer, University of Vienna, Austria; Philipp Nußbaumer, Research Studios, Austria; Sandy Payette, Cornell University, USA; Tomasz Woroniecki, DERI Galway, Ireland. Today we want to: give you a brief introduction to the Semantic Web and show how it relates to digital libraries; present existing semantic digital library systems; discuss the current problems and future directions of semantic digital libraries; and get feedback from you. After this tutorial you will know: what a semantic digital library system is; the existing solutions, in various degrees of detail; how to run semantic digital library solutions on your machine.
Tutorial Schedule
9:00 – 9:45 Introduction to Semantic Digital Libraries
9:45 – 10:30 Existing solutions - JeromeDL
10:30 – 11:00 Coffee break
11:00 – 12:15 Existing Semantic Digital Libraries solutions
12:15 – 12:30 Comparison and the future of SemDL
12:30 – 14:00 Lunch break
14:00 – 15:30 Hands-on session (part I)
15:30 – 16:00 Coffee break
16:00 – 16:45 Hands-on session (part II)
16:45 – 17:30 Conclusions, discussion
Outline Introduction to Semantic Web Semantic Digital Libraries
The Semantic Web – A Brief Introduction. Current Web vs. Semantic Web? The Semantic Web is “an extension of the current Web in which information is given well-defined meaning, better enabling computers and people to work in cooperation” [Tim Berners-Lee]. The current Web was designed for humans, and it holds little information that machines can use. Was the Web meant to be more? Objects with well-defined attributes, as opposed to untyped hyperlinks between Internet resources. A network of relationships amongst named objects, yielding unified information management tasks. What do you mean by “Semantic”? The semantics of something is the meaning of something. The Semantic Web is able to describe things in a way that computers can understand.
The Semantic Web – A Brief Introduction. Where are we in the “Semantic Web layer cake”? [Figure: the Semantic Web layer cake, with a “You Are Here!” marker.]
The Semantic Web – A Brief Introduction. The challenge for the Semantic Web: the Semantic Web can’t work all by itself. For example, it is not very likely that you will be able to sell your car just by putting your RDF file on the Web. It needs society-scale applications: Semantic Web agents and/or services, consumers and processors for semantic data, and more advanced collaborative applications.
The Semantic Web – What is RDF? Describing things on the Semantic Web. RDF (Resource Description Framework) is a data format for describing information and resources, and the fundamental data model for the Semantic Web. Using RDF, we can describe relationships between things, such as “A is a part of B” or “Y is a member of Z”, and their properties (size, weight, age, price…) in a machine-understandable format where each thing has a URI. The RDF graph-based model delivers straightforward machine processing. Putting information into RDF files makes it possible for “scutters”, or RDF crawlers, to search, discover, pick up, collect, analyse and process information from the Web.
The Semantic Web – What is RDF? A simple RDF example. Statement: “Stefan Decker is the creator of the resource (web page) http://www.stefandecker.org”. Structure: Resource (subject): http://www.stefandecker.org; Property (predicate): http://purl.org/dc/elements/1.1/creator; Value (object): “Stefan Decker”. Directed graph: http://www.stefandecker.org → dc:creator → Stefan Decker
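To make the subject/predicate/object structure concrete, here is a minimal Python sketch of the statement on this slide. The tuple representation and the helper `to_ntriples` are illustrative assumptions rather than part of any tool named in the tutorial; the output follows the N-Triples line syntax for a statement whose object is a plain literal.

```python
# Illustrative sketch: one RDF statement held as a (subject, predicate, object) tuple.
DC_CREATOR = "http://purl.org/dc/elements/1.1/creator"

# The statement from the slide: the web page has the creator "Stefan Decker".
triple = ("http://www.stefandecker.org", DC_CREATOR, "Stefan Decker")


def to_ntriples(subject, predicate, literal):
    """Serialize a statement with a plain-literal object as one N-Triples line."""
    # Backslashes and double quotes must be escaped inside a literal.
    escaped = literal.replace("\\", "\\\\").replace('"', '\\"')
    return f'<{subject}> <{predicate}> "{escaped}" .'


line = to_ntriples(*triple)
print(line)
```

The subject and predicate are URIs in angle brackets, while the object here is a quoted literal; an object could equally be another URI, which is what links statements into a graph.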
The Semantic Web – How can RDF help us? Identify objects. Establish relationships. Express a new relationship → just add a new RDF statement. Integrate information from different sources → copy all the RDF data together. RDF allows many points of view.
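The two operations named above (adding a statement, and integrating sources by copying their RDF together) can be sketched with plain Python sets of triples. The `http://example.org/...` identifiers and the two "library" graphs are hypothetical, chosen only for illustration.

```python
# Illustrative sketch: an RDF graph as a set of (subject, predicate, object) triples.
WORK = "http://example.org/object/1"  # hypothetical identifier

library_a = {
    (WORK, "dc:title", "Irises"),
    (WORK, "dc:creator", "Vincent van Gogh"),
}
library_b = {
    (WORK, "dc:creator", "Vincent van Gogh"),  # same statement as in library_a
    (WORK, "dc:date", "1889"),
}

# Integrating two sources is a set union: identical statements collapse into one.
merged = library_a | library_b

# Expressing a new relationship is just adding one more statement.
merged.add((WORK, "dc:subject", "irises"))


def describe(graph, subject):
    """Return all (predicate, object) pairs known about a subject, sorted."""
    return sorted((p, o) for s, p, o in graph if s == subject)


for predicate, value in describe(merged, WORK):
    print(predicate, value)
```

The merge works without any schema negotiation because both sources use the same identifier for the work; when they do not, a mapping step is needed first.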
The Semantic Web – Ontologies and Schemata. What is an Ontology? “An ontology is a specification of a conceptualization.” (Tom Gruber, 1993) Ontologies are social contracts: agreed, explicit semantics; understandable to outsiders; (often) derived in a community process. Ontology markup and representation languages: RDF and RDF Schema; OWL; others: DAML+OIL, EER, UML, Topic Maps, MOF, XML Schemas.
The Semantic Web – RDF Schema. Defines a small vocabulary for RDF: Class, subClassOf, type; Property, subPropertyOf; domain, range. The vocabulary can be used to define other vocabularies for your application domain. [Figure: example RDF Schema graph with classes Person, Student and Researcher (subClassOf), instances Jeen and Frank (type), and a property hasSuperVisor with domain and range.]
The Semantic Web – OWL. OWL – The Web Ontology Language. “Owl took Christopher Robin’s notice from Rabbit and looked at it nervously. He could spell his own name WOL, and he could spell Tuesday so that you knew it wasn’t Wednesday, and he could read quite comfortably when you weren’t looking over his shoulder and saying "Well?" all the time...” OWL provides a vocabulary for defining classes, their properties and the relationships among classes. Based on Description Logics. OWL is a W3C Recommendation. [Figure: class Animal with Herbivore, Carnivore and Omnivore, annotated with owl:disjointWith.]
The Semantic Web – Applications. The Semantic Web cannot be, and is not, only a set of recommendations. It is becoming reality through applications that support it and are based on it. Enabling technologies: RDF stores: Sesame, Jena, YARS. Reasoners: KAON, Racer. Editors: Protege, SWOOP, MarcOnt Portal. End-user applications: semantic wikis (Makna, SemperWiki), semantic blogs, semantic digital libraries.
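A rough Python sketch of what the RDF Schema vocabulary buys you: given subClassOf, domain and range declarations, the types of resources can be inferred from how they are used. The exact arrangement below (Student and Researcher as subclasses of Person, hasSuperVisor running from Student to Researcher, Jeen supervised by Frank) is an assumption about the figure, and the code assumes an acyclic, single-parent class hierarchy.

```python
# Assumed schema, loosely following the labels in the slide's figure.
SUBCLASS_OF = {"Student": "Person", "Researcher": "Person"}
DOMAIN = {"hasSuperVisor": "Student"}    # subjects of hasSuperVisor are Students
RANGE = {"hasSuperVisor": "Researcher"}  # objects of hasSuperVisor are Researchers

statements = [("Jeen", "hasSuperVisor", "Frank")]


def superclasses(cls):
    """Walk subClassOf upwards (assumes one parent per class and no cycles)."""
    result = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        result.append(cls)
    return result


def infer_types(triples):
    """Apply the domain, range and subClassOf rules to derive rdf:type facts."""
    types = {}
    for subject, prop, obj in triples:
        if prop in DOMAIN:
            types.setdefault(subject, set()).add(DOMAIN[prop])
        if prop in RANGE:
            types.setdefault(obj, set()).add(RANGE[prop])
    for classes in types.values():
        for cls in list(classes):
            classes.update(superclasses(cls))
    return types


types = infer_types(statements)
for resource in sorted(types):
    print(resource, sorted(types[resource]))
```

No type was stated explicitly for Jeen or Frank; both are derived from the single hasSuperVisor statement plus the schema.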
Outline Introduction to Semantic Web Semantic Digital Libraries
What is a Semantic Digital Library? Semantic digital libraries: integrate information based on different metadata, e.g. resources, user profiles, bookmarks, taxonomies (high-quality semantics = highly and meaningfully connected information); provide interoperability with other systems (not only digital libraries) on the metadata level, the communication level, or both (RDF as the common denominator between digital libraries and other services); deliver more robust, user-friendly and adaptable search and browsing interfaces empowered by semantics.
Old days  of hard-copy books Library: Archive (storage space) Bibliographic cards (metadata) Librarian (interface) Pros: Someone to talk to, to understand us, to explain, help in searching Cons: Based on physical location Libraries are not connected – we have to visit every place
Yesterday of digital books. Digital library: Database and archive (storage); Digital bibliographic descriptions (metadata); Full-text search (interface). Pros: Content accessible online. Federations of libraries – visit fewer places. Cons: Lonely user: no one to talk to, and we need to find the right keywords. What if we do not know them (“man without an ear” paintings example)? Still many problems with interconnecting (different) libraries.
Today of interconnected content. Semantic Digital Libraries: Database and archive (storage); Semantic bibliographic description (interconnected metadata); Search and browsing on ontologies (interface). Pros: Search and browsing based on semantics can help in substituting for the librarian. It is easier to interconnect heterogeneous libraries (RDF as common denominator). Cons: Semantics are created from legacy formats and are still hard for most average users to grasp.
Tomorrow of social media Social Semantic Digital Libraries Database and archive (storage) Bibliographic descriptions with annotations provided by users (metadata) Collaborative search and browsing (interface) Pros: Users contribute to the classification process Users can understand community driven annotations Users enhance digital content using blogs, wikis on the side Cons: Not everyone is convinced
How are Semantic Digital Libraries different? Semantic digital libraries extend digital libraries by describing and exposing their resources in a machine-‘understandable’ way. Resources can be: contents, digital artefacts; organization of objects (e.g. collections); users, user communities; controlled vocabularies, thesauri, taxonomies. They expose the semantics of their metadata in terms of an ontology defined using a formal language, and deliver mediation services for communication with other systems.
Semantic Web Technologies for Digital Libraries? Metadata is the key concept. The Web does not have metadata: the idea of a Semantic Web is nice but difficult to implement. Many digital libraries do have metadata in place: we simply must make it available in a machine-understandable format. The Semantic Web provides the format: RDF.
Semantic Web Technologies for Digital Libraries? Knowledge in bibliographic records. Digital libraries already have controlled vocabularies, taxonomies or even ontologies in place. The challenge is to model this knowledge in a machine-understandable way. The Semantic Web provides ontology languages: RDF Schema, OWL, SKOS.
A Sample Bibliographic Record (terms taken from Controlled Vocabularies)
Classification: Paintings
Object/Work type: paintings
Title: Irises
Creation-Date: 1889, earliest: 1889, latest: 1889
Subject-Matter: irises, nature, soil, etc.
Current Location-Repository Name: J. Paul Getty Museum
Creation-Creator/Role: Vincent van Gogh; painter: Gogh, Vincent van (Dutch painter, 1853-1890)
Copyright 2000 The J. Paul Getty Trust & College Art Association, Inc.
Knowledge Organization Systems: tools that present the organized interpretation of knowledge structures; semantic tools that capture the meaning of words and other symbols as well as the (semantic) relations between symbols and concepts; they organize information and promote knowledge management. Examples: classification and categorization schemata (organize materials at a general level); subject headings (provide more detailed access); authority files (control variant versions of key information such as geographic names and personal names); highly structured vocabularies, such as thesauri; traditional schemes, such as semantic networks and ontologies.
Taxonomy of Knowledge Organization Systems (Hodge, 2000). Term Lists: Authority files (FOAF), Glossaries, Dictionaries, Gazetteers. Classifications and Categories (DMoz): Subject headings, Classification schemes, Taxonomies, Categorization schemes. Relationship Lists: Thesauri (WordNet, MeSH), Semantic networks, Ontologies.
Understanding Knowledge Organization Systems. Controlled vocabulary: a list of terms that have been enumerated explicitly. Taxonomy: a collection of controlled vocabulary terms organized into a hierarchical structure. Formal ontology: a controlled vocabulary expressed in an ontology representation language; this language has a grammar for using vocabulary terms to express something meaningful within a specified domain of interest. Meta-model: an explicit model of the constructs and rules needed to build specific models within a domain of interest. A valid meta-model is an ontology, but not all ontologies are modeled explicitly as meta-models. A meta-model can be viewed as a set of building blocks and rules used to build models, as a model of a domain of interest, and as an instance of another model.
Simple Knowledge Organization System (SKOS): basic structure and content of concept schemes such as thesauri, classification schemes, subject heading lists, taxonomies, 'folksonomies' and other types of controlled vocabulary. Core concepts: narrower and broader; isSubjectOf and subject; isPrimarySubjectOf and primarySubject; member and Collection; memberList and OrderedCollection; related and semanticRelation; note, definition; altLabel and prefLabel; symbol and altSymbol.
Benefits of Semantic Digital Libraries  Problems of today’s libraries  rapidly growing islands of highly organized information How to find things in a growing information space? is it enough to have a full-text index (à la Google)? typical “end-users” versus “expert users” converging digital library systems e.g. uniform access to Europe’s digital libraries and cultural heritage
Benefits of Semantic Digital Libraries. The two main benefits of Semantic Digital Libraries: (1) new search paradigms for the information space: ontology-based search / facet search; community-enabled browsing. (2) interoperability on the data level: integrating metadata from various heterogeneous sources; interconnecting different digital library systems.
Searching the Sample Bibliographic Record. Full-text search: “Paintings” AND “Van Gogh” AND “flowers” → no result. Semantic query: if the knowledge that “irises” are “flowers” is modeled in an ontology (e.g. a subclass hierarchy), we can query for all “Paintings” by “Van Gogh” with subject “flowers” and also retrieve the picture with subject “irises”.
Sample record (Copyright 2000 The J. Paul Getty Trust & College Art Association, Inc.):
Classification: Paintings
Object/Work type: paintings
Title: Irises
Creation-Date: 1889, earliest: 1889, latest: 1889
Subject-Matter: irises, nature, soil, etc.
Current Location-Repository Name: J. Paul Getty Museum
Creation-Creator/Role: Vincent van Gogh; painter: Gogh, Vincent van (Dutch painter, 1853-1890)
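The difference between the two searches can be sketched in a few lines of Python. This is a simplified illustration, not how any particular library system implements it: the record is reduced to a dictionary, the ontology to a single assumed subclass link (“irises” under “flowers”), and the plain search matches subject terms literally rather than running a real full-text index.

```python
# Simplified version of the sample record.
records = [
    {
        "title": "Irises",
        "creator": "Vincent van Gogh",
        "type": "paintings",
        "subjects": {"irises", "nature", "soil"},
    },
]

# Assumed ontology fragment: narrower term -> broader term.
SUBCLASS_OF = {"irises": "flowers"}


def expand(term):
    """Return the term plus every term that is transitively a subclass of it."""
    terms = {term}
    changed = True
    while changed:
        changed = False
        for narrower, broader in SUBCLASS_OF.items():
            if broader in terms and narrower not in terms:
                terms.add(narrower)
                changed = True
    return terms


def search(creator, work_type, subject, use_ontology):
    """Find titles by creator and type whose subjects match the (expanded) term."""
    wanted = expand(subject) if use_ontology else {subject}
    return [
        r["title"]
        for r in records
        if r["creator"] == creator and r["type"] == work_type and r["subjects"] & wanted
    ]


plain_hits = search("Vincent van Gogh", "paintings", "flowers", use_ontology=False)
semantic_hits = search("Vincent van Gogh", "paintings", "flowers", use_ontology=True)
print(plain_hits, semantic_hits)
```

The literal search misses the record because the word “flowers” never appears in it; the expanded search finds it through the subclass link.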
Semantic Digital Libraries and Existing DL Systems. How to handle the legacy (meta-)data problem: lifting existing (meta-)data to a semantic level; simple solutions like MARC21 → DublinCore; complex ontologies like the MarcOnt Ontology for capturing concepts from different standards. Legacy libraries expose their metadata via well-established protocols (OAI-PMH, Z39.50, Dienst), so the metadata can be imported into semantic DLs. Semantic DLs can play the role of integration champions in the information retrieval process in heterogeneous networks.
Application Areas for Semantic Web Technologies. Thesauri & Controlled Vocabularies: qualified DublinCore; DMoz, DDC-based taxonomies; SKOS, WordNet and other thesauri. Schema Mappings / Crosswalks: MarcOnt Ontology, which aims to cover concepts from MARC21, BibTeX and DublinCore; MarcOnt Mediation Services, an open mediation framework between common legacy metadata standards. Metadata Integration: RDF as a common data model for integrating metadata from various autonomous and heterogeneous data sources; OWL for modeling the data sources’ semantics; SPARQL as a common query language.
Semantic DL as Evolving Knowledge Space. In state-of-the-art digital libraries users are consumers: they retrieve content based on available bibliographic records. Recent trends: user communities (Connotea, Flickr). In semantic digital libraries users are contributors as well: tagging (Web 2.0), Social Semantic Collaborative Filtering, annotations. Semantic digital libraries enforce the transition from a static information space to a dynamic (collaborative) knowledge space.
Existing Semantic Digital Library Systems. JeromeDL: a social semantic digital library that makes use of Semantic Web and Social Networking technologies to enhance both interoperability and usability. BRICKS: aims at establishing the organizational and technological foundations for a digital library network in order to share knowledge and resources in the cultural heritage domain. FEDORA: delivers a flexible service-oriented architecture for managing and delivering content in the form of digital objects. SIMILE: extends and leverages DSpace, seeking to enhance interoperability among digital assets, schemata, metadata, and services.
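A "simple solution" in the MARC21 → DublinCore sense is essentially a lookup table from MARC field/subfield pairs to Dublin Core properties. The sketch below uses a handful of commonly cited pairs (245$a title, 100$a creator, 260$b publisher, 260$c date, 650$a subject); it is a hedged illustration, not the MarcOnt mediation logic. The record URI is hypothetical, and the subfield values were read by hand from a journal record, so treat them as assumptions.

```python
# Partial crosswalk: (MARC tag, subfield code) -> Dublin Core property.
CROSSWALK = {
    ("245", "a"): "dc:title",
    ("100", "a"): "dc:creator",
    ("260", "b"): "dc:publisher",
    ("260", "c"): "dc:date",
    ("650", "a"): "dc:subject",
}


def lift(record_uri, marc_fields):
    """Turn (tag, subfield, value) entries into triples; unmapped fields are skipped."""
    triples = []
    for tag, code, value in marc_fields:
        prop = CROSSWALK.get((tag, code))
        if prop is not None:
            # MARC values often carry trailing ISBD punctuation such as "," or ".".
            triples.append((record_uri, prop, value.strip(" .,:;/")))
    return triples


# Hand-extracted subfields of a journal record (assumed values for illustration).
marc_fields = [
    ("245", "a", "Zeitschrift für Kunstgeschichte."),
    ("260", "a", "München ;"),
    ("260", "b", "Deutscher Kunstverlag,"),
    ("260", "c", "1932-."),
    ("300", "c", "26-29 cm."),
]

triples = lift("http://example.org/record/1", marc_fields)
for t in triples:
    print(t)
```

Fields without a mapping (place of publication, physical dimensions) are silently dropped here, which is exactly the information loss that motivates richer target ontologies.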
Tutorial – Semantic Digital Libraries -  Existing Semantic Digital Libraries Solutions  -  Sebastian R. Kruk ,  Bernhard Haslhofer,  Philipp Nußbaumer ,  Sandy Payette,  Tomasz Woroniecki
Tutorial 7 – Semantic Digital Libraries -  Existing Semantic Digital Libraries Solutions  – JeromeDL Sebastian R. Kruk , Tomasz Woroniecki
Outline JeromeDL - Motivation and Overview JeromeDL - Architecture and Ontologies JeromeDL - Semantic Services JeromeDL - Social Services JeromeDL - Semantics in Use
JeromeDL - Introduction. Joint effort of DERI, National University of Ireland, Galway and Gdansk University of Technology (GUT). Distributed under the BSD Open Source license. A digital library built on Semantic Web technologies to answer requirements from librarians, scientists and everyone.
Motivation How to integrate  and search  information from different  bibliographic  sources?  How to share and interconnect knowledge among people?
JeromeDL – Motivations. Use Cases. Librarians: support for rich metadata (MARC21) in uploading resources, accessing bibliographic information and searching; persistent identifiers. Scientists: easy publishing (designed as an institute/university digital library); creating hierarchical networks of digital libraries; support for accessing, sharing and searching using bibliography metadata (BibTeX). Everyone: simple search (incl. natural language queries); community-aware information sharing and browsing; support for internationalization.
JeromeDL - Motivations. Support for different kinds of bibliographic metadata, like DublinCore, BibTeX and MARC21, at the same time. Making use of existing rich sources of bibliographic descriptions (like MARC21) created by humans. Supporting users and communities: users have control over their profile information; community-aware profiles are integrated with bibliographic descriptions; support for community-generated knowledge. Delivering communication between instances: P2P mode for searching and user authentication; hierarchical mode for browsing.
Outline JeromeDL - Motivation and Overview JeromeDL - Architecture and Ontologies JeromeDL - Semantic Services JeromeDL - Social Services JeromeDL - Semantics in Use
JeromeDL – Architecture Resources and annotations repository Middleware: query processing community space resources management User interface agents: Communication to the outside world Administrative interface
Bibliographic Description in JeromeDL
DublinCore (RDF/XML):
<?xml version="1.0" encoding="UTF-8" ?>
<rdf:Description rdf:about="http://...id=828374765">
  <dc:title>JeromeDL - Adding Semantic Web Technologies to DLs</dc:title>
  <dc:creator>Sebastian Kruk</dc:creator>
  <dc:description>In recent years...</dc:description>
</rdf:Description>
MARC21:
01450cas 922004331i 450000100...019c19329999gw  qr|p|  ||||0  |0ger |  a0044-2992 9a200412140219bVLOADc200404071525dvkulc200310071018dvbjc200303101205dkopumky200209211341zVLOAD  aGD U/MPcGD  U/MPdGD U/MFdGD U/KKsdWR O/EJ0 ager1 aZ. Kunstgesch. 0aZeitschrift für Kunstgeschichte00aZeitschrift für Kunstgeschichte.18aZfK  aMünchen ;aBerlin :bDeutscher Kunstverlag,c1932-.  c26-29 cm.  aKwart.0 a1 Bd. (Juni 1932)-.  aOpis na podst.: LCC.  aW 1932 założycielami czasopisma byli Wilhelm Waetzoldt i Ernst Gall....
BibTeX:
@InProceedings{jeromedexa2005,
  author = "Sebastian Ryszard Kruk and ...",
  title = "{JeromeDL - Adding Semantic ...}",
  booktitle = "{In Proceedings to DEXA 2005}",
  year = 2005}
These all can be represented in RDF.
Structure ontology in JeromeDL
Bibliographic (MarcOnt) Ontology in JeromeDL
Community-aware (FOAFRealm) ontology
Ontologies in JeromeDL
Metadata and Services in JeromeDL
Outline JeromeDL - Motivation and Overview JeromeDL - Architecture and Ontologies JeromeDL - Semantic Services JeromeDL - Social Services JeromeDL - Semantics in Use
Semantic  Metadata and Services
MarcOnt Initiative – Overview. Motivation: provide a set of tools for collaborative ontology development. MarcOnt Initiative goals: create a framework for collaborative ontology improvement (E-learning); provide domain experts with tools to share their knowledge; offer tools for data mediation between different data formats.
MarcOnt Portal and MarcOnt Ontology. MarcOnt Ontology: central point of the MarcOnt Initiative; translation and mediation format; continuous collaborative ontology improvement; knowledge from the domain experts. MarcOnt Portal (source of knowledge): suggestions; annotations; versioning; ontology editor.
MarcOnt Mediation Services for Legacy Metadata Format translation RDF Translator Format co-operation MarcOnt Mediation Services
Browsing the data graph – why? The search does not end with a (long) list of results. The results are not a list (!) but a graph. “Lost in hyperspace”. A need for a unified UI and services to filter/narrow and browse/expand. Share the browsing experience: navigate collaboratively.
Browsing the data graph – how? Defines REST access to services and their composition. Basic services: access, search, filter, similar, browse, combine. Meta services: RDF serialization, subscription channels, service ID generation. Context services: manage contexts, manage service calls/compositions in the context, list contexts. Statistics services: properties, values, tokens.
Browsing the data graph JeromeDL exploits interconnected data
Browsing the data graph …  to allow browsing
Outline JeromeDL - Motivation and Overview JeromeDL - Architecture and Ontologies JeromeDL - Semantic Services JeromeDL - Social Services JeromeDL - Semantics in Use
Semantic  Metadata and Services
Social Services in JeromeDL. Involve users in sharing knowledge: Blogs (comments and discussions about documents and resources); Tagging (collaborative classification); Wikis (collaboratively edited additional descriptions, such as summaries and interesting facts). Preserve knowledge for future use: users can learn from the experience of others instantly. Recommend new, interesting resources based on users’ profiles.
FOAF - Describing Social Networks. FOAF stands for Friend-of-a-Friend. It defines properties for a person (but it does not have to be a person; it can be an “agent”). A file does not have to contain only one person. You can build a network of people with foaf:knows links. FOAF can be easily extended to meet requirements, as in the case of FOAFRealm for identity management…
Identity management with FOAFRealm Identity defined with extended FOAF metadata Policies expressed by social networking  Distance between owner and requester Friendship level between owner and requester, calculated across digraph of social network Support for single registration and sign on Distributed identity management with HyperCuP (“D-FOAF”) FOAFRealm is currently implemented as a plugin for Tomcat (Realm/Valve implementation), with PHP and .NET versions coming soon
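The "distance between owner and requester" policy can be illustrated with a breadth-first search over foaf:knows links. This is only a sketch of the idea under stated assumptions: the people and links are invented, the function names are hypothetical, and the friendship-level weighting that FOAFRealm also computes is not modeled here.

```python
from collections import deque

# Hypothetical directed social graph: person -> people they foaf:know.
KNOWS = {
    "alice": ["bob"],
    "bob": ["carol"],
    "carol": ["dave"],
    "dave": [],
}


def distance(owner, requester):
    """Number of foaf:knows hops from owner to requester, or None if unreachable."""
    if owner == requester:
        return 0
    seen = {owner}
    queue = deque([(owner, 0)])
    while queue:
        person, hops = queue.popleft()
        for friend in KNOWS.get(person, []):
            if friend == requester:
                return hops + 1
            if friend not in seen:
                seen.add(friend)
                queue.append((friend, hops + 1))
    return None


def may_access(owner, requester, max_distance):
    """Grant access only if the requester is within max_distance hops of the owner."""
    hops = distance(owner, requester)
    return hops is not None and hops <= max_distance


print(may_access("alice", "carol", 2), may_access("alice", "dave", 2))
```

Because the graph is directed, reachability is not symmetric: in this toy data dave is three hops from alice, but alice cannot be reached from dave at all.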
Social Semantic Collaborative Filtering. Why? The bottom line of acquiring knowledge: informal communication (“word of mouth”). How? Everyone classifies (filters) the information in bookmark folders (user-oriented taxonomy). Peers share (collaborate over) the information (community-driven taxonomy). Result? Knowledge “flows” from the expert through the social network to the user. The system amasses a lot of information on the user/community profile (context).
Social Semantic Collaborative Filtering Problems? The  horizon of a social network  (2-3 degrees of separation) How to handle  fine-grained information  (blogs, wikis, etc.) Solutions?  Inference engine to  suggest knowledge  from the outskirts of the social network Support for  SIOC metadata : SIOC browser in SSCF Annotations and evaluations of “local” resources
What is Social Semantic Collaborative Filtering? Goal: to enhance individual bookmarks with shared knowledge within a community. Users annotate catalogues of bookmarks with semantic information taken from DMoz or WordNet vocabularies. Catalogues can include (transclusion) friends' catalogues. Access to catalogues can be restricted with social networking-based policies. SSCF delivers: community-oriented, semantically rich taxonomies; information about a user's interests; flows of expertise from the domain expert; recommendations based on users' previous actions; support for SIOC metadata.
Social Semantic Collaborative Filtering. [Figure: graph of users and bookmark catalogues linked by foaf:knows, xfoaf:include and xfoaf:bookmark.]
Social Networks in Digital Libraries. [Figure: graph linking a Resource, an xfoaf:Annotation and an xfoaf:Directory to creator_A, creator_B, user_C and user_D through foaf:knows, marcont:hasCreator, xfoaf:owns, xfoaf:linksTo and xfoaf:isIn.]
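Transclusion of a friend's catalogue can be pictured as following include links when a catalogue is displayed. The Python sketch below is an assumption-laden illustration: the catalogue names, the `example.org` bookmarks and the dictionary layout are all hypothetical, and access policies are left out. It resolves includes recursively and guards against two catalogues including each other.

```python
# Hypothetical catalogues: each has its own bookmarks and may include others.
CATALOGUES = {
    "alice/semweb": {
        "bookmarks": ["http://example.org/rdf-primer"],
        "includes": ["bob/dl"],
    },
    "bob/dl": {
        "bookmarks": ["http://example.org/dl-survey"],
        "includes": ["alice/semweb"],  # mutual inclusion must not loop forever
    },
}


def resolve(name, seen=None):
    """Return a catalogue's bookmarks followed by those of every included catalogue."""
    seen = set() if seen is None else seen
    if name in seen:
        return []  # already visited: break the cycle
    seen.add(name)
    catalogue = CATALOGUES[name]
    bookmarks = list(catalogue["bookmarks"])
    for included in catalogue["includes"]:
        bookmarks.extend(resolve(included, seen))
    return bookmarks


alice_view = resolve("alice/semweb")
print(alice_view)
```

The included catalogue is referenced, not copied, so a bookmark Bob adds later would show up in Alice's view the next time it is resolved.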
Support for online communities in SSCF
Outline JeromeDL - Motivation and Overview JeromeDL - Architecture and Ontologies JeromeDL - Semantic Services JeromeDL - Social Services JeromeDL - Semantics in Use
JeromeDL – Delivering Semantic Content Providing semantic annotations during uploading process: open module for handling any taxonomies keywords based on WordNet and free tagging defining structure of resources in the JeromeDL ontology Lifting legacy metadata to MarcOnt ontology Community maintained annotations social semantic collaborative filtering semantic descriptions based on the FOAF metadata
Annotating Library Resources
JeromeDL – Semantic Information In Use. Searching: keyword-based search with semantic query expansion; semantic search (direct RDF querying, natural language templates). Browsing: Exhibit, MultiBeeBrowse. Sharing: Social Semantic Collaborative Filtering; Semantically Interlinked Online Communities. Heterogeneous communication: Bibster, A9, OAI-PMH.
Exposing Semantic Annotations
Filtering Resources in JeromeDL
Sharing Knowledge with SSCF
Information Retrieval in JeromeDL. [Figure: information retrieval pipeline. Labels: RDF & NL Query, OpenSearch, RSS, (typed) keywords, types translation, semantic query expansion, collaborative filtering, Fulltext Index, Resources’ Content, RDF Repositories, Structure Repository, MarcOnt Repository, FOAFRealm Repository, Secure Snapshot, local interface, distributed interface.]
Networks of Digital Libraries. ELP (Extensible Library Protocol) implementation: communication within the JeromeDL network; adapters for communication with other networks. D-FOAF integration (distributed user profile management): single sign-on and single registration within the D-FOAF network. HyperCuP integration (scalable P2P network). Independent ELP network entry point: http://search.jeromedl.org/
Tutorial – Semantic Digital Libraries -  Existing Semantic Digital Libraries Solutions  –  BRICKS Bernhard Haslhofer University of Vienna Austria Philipp Nußbaumer  Research Studios Austria
Outline BRICKS Overview BRICKS Components BRICKS Applications
What is BRICKS? A software infrastructure for building digital library networks: transparent access to distributed resources; multilinguality; easy installation & maintenance. A set of end-user applications: network & content management; Web 2.0 tagging/annotations; domain-specific applications. A business model: open source, platform independent; low-cost infrastructure; user communities → sustainability.
BRICKS Architecture. A decentralized P2P network: avoids central coordination; highly scalable, increased reliability; minimized maintenance costs. Each P2P node is a set of SOA components: Web Service interface; platform independent; flexible composition. Components for: storing, accessing and protecting digital objects; (semantic) search & browsing; P2P communication.
Accessing Data
A Look into a BNode
Outline BRICKS Overview BRICKS Components BRICKS Applications
Collection Manager Single access point for all content and metadata related operations (local and remote) Physical Collection Similar to folder/directory hierarchy in a file system Bound to a single BNode Each digital content object belongs to exactly one collection Logical Collection Virtual folder for organizing content items independent of their physical location  Links to content items from various physical collections on different BNodes A content item might belong to many of them Stored Query similar to database views
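The physical/logical distinction can be sketched as two indexes: one that pins each item to exactly one physical collection on one BNode, and one that lets any number of logical collections link to it. The class below is a hypothetical illustration in Python, not the BRICKS API; all names (`CollectionManager`, `store`, `link`, `locations`, the BNode identifiers) are invented for the example.

```python
class CollectionManager:
    """Toy model: one physical home per item, many logical collections per item."""

    def __init__(self):
        self.physical = {}  # item id -> (bnode, physical collection path)
        self.logical = {}   # logical collection name -> set of item ids

    def store(self, item_id, bnode, collection):
        """Place an item in exactly one physical collection on one BNode."""
        if item_id in self.physical:
            raise ValueError(f"{item_id} already belongs to a physical collection")
        self.physical[item_id] = (bnode, collection)

    def link(self, logical_name, item_id):
        """Add an existing item to a logical collection, wherever it physically lives."""
        if item_id not in self.physical:
            raise KeyError(item_id)
        self.logical.setdefault(logical_name, set()).add(item_id)

    def locations(self, logical_name):
        """Physical locations of all items linked from a logical collection."""
        return sorted(self.physical[i] for i in self.logical.get(logical_name, ()))


manager = CollectionManager()
manager.store("irises", "bnode-getty", "/paintings")
manager.store("sunflowers", "bnode-vienna", "/paintings/19c")
manager.link("van-gogh", "irises")
manager.link("van-gogh", "sunflowers")
manager.link("flowers", "irises")  # one item, several logical collections
print(manager.locations("van-gogh"))
```

A stored query, the third kind mentioned on the slide, would differ from a logical collection in that its membership is computed at access time instead of being kept as explicit links.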
Content Manager Two ways to handle content in BRICKS Stored locally at site of a member party, accessed via URL Stored within BRICKS Based on Java Content Repository (JCR) Provides a meta-content model Re-use of existing content models Use standard models
Metadata Manager. Metadata descriptions → RDF: suitable for any application scenario; express relationships between objects; react to changes without changing the model. Schema definitions → OWL: no fixed schema; extensible (e.g. application profiles); semantic concepts instead of schematic structures. SPARQL: metadata queries over ontology concepts; queries for graph patterns.
Security Manager. Transparently invoked by the framework: any service call is checked. Context-aware policies based on RBAC (via XACML rules), supporting roles and groups at DLObject level. Permission declaration through Javadoc @tags. Federated identity is managed through an adapted version of OpenSAML. Reputation-based trust calculation integrated. Web-based GUI for security configuration.
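"Queries for graph patterns" means matching triples that contain variables against the stored graph and joining the results on shared variables. The following Python sketch imitates that idea for a conjunction of triple patterns; it is not a SPARQL engine, and the graph contents and the `ex:` and `obj` names are hypothetical.

```python
def match(pattern, triple, binding):
    """Extend a variable binding so the pattern equals the triple, or return None."""
    new = dict(binding)
    for p_term, t_term in zip(pattern, triple):
        if p_term.startswith("?"):
            # A variable: bind it, or check it against its existing binding.
            if new.setdefault(p_term, t_term) != t_term:
                return None
        elif p_term != t_term:
            return None
    return new


def query(graph, patterns):
    """Find every binding that satisfies all triple patterns (a basic join)."""
    bindings = [{}]
    for pattern in patterns:
        extended = []
        for binding in bindings:
            for triple in graph:
                result = match(pattern, triple, binding)
                if result is not None:
                    extended.append(result)
        bindings = extended
    return bindings


graph = [
    ("obj1", "rdf:type", "ex:Painting"),
    ("obj1", "dc:creator", "Vincent van Gogh"),
    ("obj2", "rdf:type", "ex:Painting"),
    ("obj2", "dc:creator", "Claude Monet"),
]

# Roughly: SELECT ?work WHERE { ?work rdf:type ex:Painting . ?work dc:creator "Vincent van Gogh" }
results = query(graph, [
    ("?work", "rdf:type", "ex:Painting"),
    ("?work", "dc:creator", "Vincent van Gogh"),
])
print(results)
```

The shared variable `?work` is what joins the two patterns; because the query is phrased over properties rather than table columns, new properties can be added to the graph without changing existing queries.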
Digital Rights Management DRM Component Support for licenses based on  MPEG-21 REL license declaration standard Generic API for the integration of commercial DRM systems Watermarking Open-source watermarking tool for images Other tools can be integrated BRICKS Store web application for commercial content Creative Commons support for other content in BRICKS
Outline BRICKS Overview BRICKS Components BRICKS Applications
Application: BRICKS Workspace  What does it demonstrate? A web application (thin client) accessing BRICKS Foundation services Web 2.0 image annotations Reference application Primary customers General end-users (citizens) Application developers Technology Struts based interface to the BCH
Application: BRICKS Desktop  What does it demonstrate? A rich client application accessing BRICKS foundation services Direct access to the BCHN Primary customers Expert end-users (researchers, educators) Application developers Technology Eclipse based rich client interface
Application: Annotation Tool What does it demonstrate? Tool which allows end-users to annotate images Creation of annotation threads Supervised Annotations Primary customers End-users Institutions with large image collections Technology Web Application
Application: Online Exhibition Authoring Tool. What does it demonstrate? Creating and publishing online exhibitions using content that is available in the BRICKS network. Primary customers? Expert end-users (curators). Technology: Web Application.
Application: Archeological Finds Identifier. What does it demonstrate? A web application for comparing findings (e.g. ancient coins) with objects in reference collections. Application of a complex domain ontology (CIDOC-CRM). Map visualization of GIS metadata. Primary customers? Museum curators, archaeologists, students, amateurs. Technology: Struts-based interface.
References
BRICKS Community Web Site: http://www.brickscommunity.org/ (main contact: silvia.boi@metaware.it)
Related (de-facto) standards:
Resource Description Framework (RDF): http://www.w3.org/TR/rdf-primer/
OWL Web Ontology Language (OWL): http://www.w3.org/TR/owl-guide/
SPARQL: http://www.w3.org/TR/rdf-sparql-query/
Java Content Repository (JCR): http://www.jcp.org/en/jsr/detail?id=170
Tools and libraries:
Jackrabbit: http://jackrabbit.apache.org/
Jena Semantic Web Framework: http://jena.sourceforge.net/
Tutorial – Semantic Digital Libraries -  Existing Semantic Digital Libraries Solutions  –  Fedora Sandy Payette Director, Fedora Project Cornell University
Outline Fedora Examples: PLoS ONE and National Science Digital Library
Fedora Semantic Digital Libraries enable … Scholarly and Scientific Workbenches “Web 2.0” Collaborative Repositories Museum Exhibits with Lesson Plans Linking Data and Publications blog and wiki
The Fedora Project Fedora: Flexible Extensible Digital Object Repository Architecture History Cornell Research (1997-2002)  DARPA and NSF-funded research and reference implementations Distributed, Interoperable Repositories (experiments with CNRI) Open Source Project (2002-present) Andrew W. Mellon Foundation (2002-2009) Joint development by Cornell University and University of Virginia Transitioning into non-profit organization (Fedora Commons 501c3)
Arts and Humanities
Sciences Education
Fedora - Technology Integration Semantic Repository Enterprise Preservation Information Networks Contextualization Relationships Query Inference Workflow Messaging Transactions Replication Digital Objects Manage Access Versioning Storage Integrity Check Monitoring Alerting Migration
Fedora Digital Objects Flexible object model can support Documents, articles, journals Electronic Scholarly Texts Digital Images Complex multimedia publications  Datasets Metadata Learning objects More… Create “networks” of objects using RDF Define object relationships and other properties via RDF Collection/member; part/whole; etc.
RDF in the Fedora Digital Object Model
Motivations:  Fedora and Semantic Technologies A natural model for exposing repository as network of objects Object-to-object relationships Relationships to external entities Query the graph; traversal to discover related stuff Indexing based on generalizable data model Graph-based data model is a common reduction Avoid fixed schema problems and metadata mud wrestling  Extensible enrichment of object descriptions Keep overlaying statements from multiple ontologies Organic evolution Powerful queries and inference for repository management Transitive relationships among objects Dependency analysis;  Detection/Extraction of sub-graphs Provenance of disseminations
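The "transitive relationships among objects" query mentioned on this slide can be illustrated with a small graph traversal. This is a minimal sketch and not Fedora's Resource Index API: the `hasPart` predicate name, the `demo:` identifiers, and the `transitive_closure` helper are all hypothetical.

```python
from collections import deque

# Hypothetical part/whole assertions as (subject, predicate, object) triples.
TRIPLES = {
    ("demo:collection", "hasPart", "demo:book"),
    ("demo:book", "hasPart", "demo:chapter-1"),
    ("demo:book", "hasPart", "demo:chapter-2"),
    ("demo:chapter-1", "hasPart", "demo:figure-1"),
}


def transitive_closure(triples, start, predicate):
    """Return every node reachable from `start` by following `predicate`."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for s, p, o in triples:
            if s == node and p == predicate and o not in seen:
                seen.add(o)
                queue.append(o)
    return seen


parts = transitive_closure(TRIPLES, "demo:collection", "hasPart")
print(sorted(parts))
```

A production triplestore would answer this with an index or a query-language construct rather than a full scan per step; the scan keeps the sketch short.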
Digital Objects contain their RDF assertions Assert relationships from Fedora base ontology Collection – member Whole – part Equivalence Description Of More… Assert relationships/properties from community ontologies isAnnotationOf isRecommendedBy isCertifiedBy More ….
Example: Digital Objects with “compositional semantics”
Use Case:  scholarly objects and annotation in the humanities museum and library objects commercial web content scholarly objects URI-100 xx:recommends URI-55 yy:certifies
3 Objects – 3 RDF “Relationships” Datastreams
<rdf:Description rdf:about="info:fedora/uva:pid-11">
  <ais:annotationOf rdf:resource="info:fedora/uva:pid-3"/>
</rdf:Description>
</rdf:RDF>
<rdf:Description rdf:about="info:fedora/uva:pid-3">
  <uva:hasPartLetter rdf:resource="info:fedora/uva:pid-2"/>
  <uva:hasPartDiagram rdf:resource="info:fedora/uva:pid-1"/>
</rdf:Description>
</rdf:RDF>
<rdf:Description rdf:about="info:fedora/uva:pid-10">
  <ais:providesContextFor rdf:resource="info:fedora/uva:pid-3"/>
</rdf:Description>
</rdf:RDF>
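To show how such a relationships datastream turns into triples, here is a minimal sketch using only Python's standard library. The slide omits the opening `rdf:RDF` element and its namespace declarations, so the `http://example.org/uva#` namespace URI below is a placeholder assumption, and `datastream_to_triples` is a hypothetical helper, not part of Fedora.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Datastream for object pid-3. The uva namespace URI is a placeholder,
# because the slide does not show the real declaration.
DATASTREAM = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:uva="http://example.org/uva#">
  <rdf:Description rdf:about="info:fedora/uva:pid-3">
    <uva:hasPartLetter rdf:resource="info:fedora/uva:pid-2"/>
    <uva:hasPartDiagram rdf:resource="info:fedora/uva:pid-1"/>
  </rdf:Description>
</rdf:RDF>
"""


def datastream_to_triples(xml_text):
    """Extract (subject, predicate, object) triples from simple RDF/XML."""
    root = ET.fromstring(xml_text)
    triples = []
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            # ElementTree tags look like "{namespace}local"; dropping the
            # braces yields the full predicate URI.
            predicate = prop.tag.replace("{", "").replace("}", "")
            obj = prop.get(f"{{{RDF_NS}}}resource")
            triples.append((subject, predicate, obj))
    return triples


triples = datastream_to_triples(DATASTREAM)
for t in triples:
    print(t)
```

The sketch handles only the `rdf:Description` / `rdf:resource` pattern used on the slide; a full RDF/XML parser covers many more syntactic forms.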
NOT the core object store - RI is a graph-based index of the repository Automatic, incremental indexing into triplestore Search/query the repository via Fedora RI Query Interface Fedora RDF-based Resource Index (RI) RDF Index of Repository RDF datastream Fedora object properties DC datastream Digital Object Store
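The idea of an index that sits beside the object store and is updated incrementally, one object at a time, can be sketched as follows. This is a toy in-memory illustration under stated assumptions, not the Fedora RI implementation: the `ResourceIndex` class, its method names, and the shortened identifiers are hypothetical.

```python
class ResourceIndex:
    """Toy triple index keyed by the digital object that asserted each triple."""

    def __init__(self):
        # object id -> set of triples contributed by that object
        self._by_object = {}

    def index_object(self, object_id, triples):
        # Re-indexing replaces only this object's triples (incremental update).
        self._by_object[object_id] = set(triples)

    def remove_object(self, object_id):
        self._by_object.pop(object_id, None)

    def match(self, s=None, p=None, o=None):
        """Return triples matching an SPO pattern; None acts as a wildcard."""
        result = []
        for triples in self._by_object.values():
            for t in triples:
                if all(q is None or q == v for q, v in zip((s, p, o), t)):
                    result.append(t)
        return sorted(result)


ri = ResourceIndex()
ri.index_object("pid-3", [("pid-3", "hasPartLetter", "pid-2"),
                          ("pid-3", "hasPartDiagram", "pid-1")])
ri.index_object("pid-11", [("pid-11", "annotationOf", "pid-3")])

annotations = ri.match(p="annotationOf", o="pid-3")
print(annotations)
```

Because the index is derived from the objects, it can be discarded and rebuilt from the object store at any time, which is the property the slide stresses.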
RI Graph - view 1 (abbreviated) …
RI Graph - view 2 (abbreviated) …
RI Implementation: The Triplestore Challenge Scalability Few triplestores perform well  for 100M+ triples Kowari – we tested to 180M triples MPTStore – we tested to 250M triples Performance Jena - easy to get out of memory Sesame Native - slow for complex queries  Kowari  Fast queries and full-featured query language (iTQL) Instability and corruption problems MPTStore Very fast for SPO queries (limited support for complex queries) Add/modify significantly faster than Kowari Mulgara Fork of Kowari; complex queries; models; inference Major bug fixes to fix stability and corruption problems XA2 transactions Claims support for billions of triples
Fedora Repository – Notable Features Generic Digital Object Model Automatic content versioning and audit trail Web Service Interfaces (REST and SOAP) Authentication Authorization Flexible fine-grained policy enforcement Built-in support for Extensible Access Control Markup Language (XACML) RDF Each object contains its own RDF assertions Repository-wide index of all objects (RDF triplestore) Self-healing – rebuild repository via digital object source files
Outline FEDORA  Examples: PLoS ONE and National Science Digital Library
PLoS ONE and Topaz Open Access Publishing and Collaboration
NSDL:  Semantic Digital Library Architecture NDR
What is NSDL committed to? NSDL 2.0 as a platform for a collaborative, contributory semantic digital library Supporting communities across the full range of science, technology, engineering and mathematics research, learning and education Supporting the creation of context around library resources to enhance discovery, use, and understanding
NSDL Semantic Digital Library repository requirements Supports storing both content and metadata Allows arbitrary relationships among resource and metadata objects: organization, annotation, citation Accessible through web service architecture of remixable data sources and transformations
NSDL Data Repository (NDR) Implemented in Fedora 2.2 with MPTStore  Moderately large 4.7 million digital objects 250 million RDF triples Digital Objects Resources Metadata Agents Metadata providers Aggregators REST API and authentication In production at nsdl.org
NSDL as Semantic Digital Library :  collaboration, context, and contribution Platform:  Fedora repository and services Applications: Solution 1: Leverage the existing successful models: blogs, wikis, bookmarking/tagging Solution 2: Leverage the existing software: WordPress, MediaWiki, Connotea, Sakai Solution 3: Engage with partners and the broader community to build applications to the platform
Expert Voices -  Blogs on top of Fedora
Expert Voices NSDL Blogosphere (http://expertvoices.nsdl.org) Topic-based discussions (e.g. forensics) linked to related library resources A way for NSDL community members to become NSDL contributors of resources, questions, reviews, annotations, metadata Technology: WordPress-based multi-user multi-blog application (open source, plug-in architecture) Owner controls publication of entries as NSDL resources and visibility of comments  (NSDL middleware and Shibboleth) Blog Entries: linked references to NSDL library resources
NSDL 2.0 – The Whole Ecosystem … Protocol: OAI-PMH HTTP REST NDR API STEM Collections Search Service Archive Service Fedora-based   NDR
NSDL 2.0 and the Semantic Web NSDL 2.0 applications situate resources in context, aiding both discovery and use Users become contributors, adding new resources, ratings, annotations, and organizational structure – frequently as a side effect of using the library Fedora-based semantic web technology organizes resources, ties context to content, maintains provenance, enables discovery, empowers the user, and powers the library
Fedora Web Site:  www.fedora.info Community Open Source Tools:  www.fedora.info/tools Fedora Wiki:  www.fedora.info/wiki Tutorial:  http://openarchives.org/fedora/ESWC-Fedora.zip
Tutorial – Semantic Digital Libraries -  Comparison and the Future  -  Sebastian R. Kruk ,  Bernhard Haslhofer,  Philipp Nußbaumer ,  Sandy Payette,  Tomasz Woroniecki
Outline SIMILE – short overview Comparison between existing solutions Digital Libraries and Social Web Semantic Digital Libraries Scenarios
SIMILE – Introduction SIMILE - Semantic Interoperability of Metadata and Information in unLike Environments joint project conducted by the W3C, HP, MIT Libraries, and MIT's Lab for Computer Science.  extends and leverages DSpace, seeking to enhance interoperability among digital assets, schemata, metadata, and services Goal:  Make metadata interoperability easier for digital libraries by providing useful tools for browsing, searching and mapping heterogeneous metadata in RDF   [MacKenzie Smith, MIT Libraries]
SIMILE – Introduction SIMILE: enhances interoperability and provides end-user services: for digital assets, arbitrary schemata, metadata and services. across distributed individual, community, and institutional stores.  through the application of RDF and semantic web techniques.  implements a digital asset dissemination architecture based upon web standards
SIMILE – Delivered Components Tools for Metadata Managers Gadget - XML inspector RDFizers - Batch tools to transform existing XML data into RDF Solvent - Firefox extension for JavaScript screen scraping Welkin - Graphical tool to inspect/edit RDF graph Tools for End-Users Longwell - Web-based RDF faceted metadata browser Piggy Bank - Firefox extension for personal information management of metadata in RDF Semantic Bank - Web-based server that allows data publishing and sharing by individuals, groups, or communities Exhibit - lightweight structured data publishing framework Timeline - AJAXy widget for visualizing time-based events
RDFizers - Transform XML data into RDF: tools that transform existing data into an RDF representation  List of RDFizers in SIMILE: MARC/MODS → RDF, OAI-PMH → RDF, OCW → RDF, EMail → RDF, BibTEX → RDF, Flat → RDF, Weather → RDF, Java → RDF, Javadoc → RDF, Jira → RDF, Subversion → RDF, Random → RDF
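The general shape of such a converter, taking a flat record and emitting RDF, can be sketched in a few lines. This is an illustration only and not one of the SIMILE RDFizers: the `record_to_ntriples` and `escape_literal` helpers, the example record, and the `http://example.org/item/1` subject URI are hypothetical. The output uses the N-Triples syntax and the Dublin Core element namespace.

```python
DC = "http://purl.org/dc/elements/1.1/"


def escape_literal(value):
    """Escape backslashes, quotes and newlines for an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def record_to_ntriples(subject_uri, record):
    """Map each field of a flat record to one Dublin Core statement."""
    lines = []
    for field, value in sorted(record.items()):
        lines.append(f'<{subject_uri}> <{DC}{field}> "{escape_literal(value)}" .')
    return lines


record = {"title": 'A "quoted" title', "creator": "Jane Doe"}
lines = record_to_ntriples("http://example.org/item/1", record)
for line in lines:
    print(line)
```

The sketch assumes every field name is a valid Dublin Core element; a real converter needs an explicit mapping from source fields to properties.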
Solvent - JavaScript screen scraping: a Firefox extension that helps write JavaScript screen scrapers for Piggy Bank. Motivation: Piggy Bank needs web pages to embed information in RDF. Unfortunately, not many web pages embed or link to RDF information. Piggy Bank can execute a particular screen scraper on particular pages in order to "extract" the information it needs. This turns a regular web page into a semantic web page, freeing the data from the page/site that contains it.
Solvent - JavaScript screen scraping
Longwell - RDF faceted metadata browser
PiggyBank Firefox extension for managing metadata - Loads RDF into local Longwell server Search and faceted browse of local RDF - Views defined by library, other users Users can find, collect, annotate RDF - Can then publish for access by others
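Faceted browsing over RDF, as offered by Longwell and Piggy Bank, boils down to counting property values across the items that match the current selection. The following is a minimal sketch under simplifying assumptions (each item has at most one value per property); the `facets` function and the `item:` data are hypothetical and do not reflect Longwell's implementation.

```python
from collections import Counter, defaultdict

TRIPLES = [
    ("item:1", "type", "Book"), ("item:1", "language", "en"),
    ("item:2", "type", "Book"), ("item:2", "language", "de"),
    ("item:3", "type", "Article"), ("item:3", "language", "en"),
]


def facets(triples, restrict=None):
    """Count values per property for items matching every pair in `restrict`."""
    restrict = restrict or {}
    by_subject = defaultdict(dict)
    for s, p, o in triples:
        by_subject[s][p] = o  # assumption: one value per property
    counts = defaultdict(Counter)
    for props in by_subject.values():
        if all(props.get(p) == v for p, v in restrict.items()):
            for p, o in props.items():
                counts[p][o] += 1
    return {p: dict(c) for p, c in counts.items()}


all_facets = facets(TRIPLES)
book_facets = facets(TRIPLES, {"type": "Book"})
print(all_facets)
print(book_facets)
```

Selecting a facet value simply adds a pair to `restrict`, and the remaining counts tell the user which further refinements are still non-empty.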
PiggyBank
SemanticBank Semantic Bank use cases: persist information remotely on a server share information with other people lets you publish your information, both in RDF or to regular web pages for individuals, groups, communities - e.g. conference proceedings the ability to tag resources creates a powerful serendipitous categorization Longwell faceted browsing view of published information
Timeline
Exhibit
Outline SIMILE – short overview Comparison between existing solutions Digital Libraries and Social Web Semantic Digital Libraries Scenarios
System Features Comparison: General Properties
Property | JeromeDL | BRICKS | Fedora
OS Support | Any | Any | Any
Hardware Requirements | 500MB RAM, min 128MB HD | 500MB RAM, min 100MB HD | 500MB RAM, min 100MB HD
Software Requirements | Java 1.5, Tomcat 5.5, Sesame | Java 1.4/1.5, Jena | Java 1.5, Tomcat, Kowari/Mulgara or MPTStore
Current Stage | Research Stable version 2.0.1 | Second Prototype | Production Version 2.2
No. Installations | 12+ | ~8 | ~50 monitored; large # of downloads unmonitored
Support Model | Open Source | Open Source | Open Source
System Features Comparison: Architectural Aspects
Aspect | JeromeDL | BRICKS | Fedora
Distribution | Distributed searching (P2P), aggregated browsing (hierarchical) | Fully decentralized (P2P) | federation via nameresolver search services; Alvis P2P
Architecture Granularity | Low (main building blocks) | High (many Components) | High (core repository service with configurable modules; loosely coupled services)
DB - Support | Any Sesame-compliant backend | Any Jena compliant backend | MySQL, Postgres, Oracle, McKoi; Kowari/Mulgara
System Features Comparison: Content & Metadata Aspects
Aspect | JeromeDL | BRICKS | Fedora
Content Types | All | All | All
Content Models | JeromeDL ontology | Any | Any
Metadata Schema | MarcOnt + extensions | Any RDF/S & OWL schema | Any XML Schema, RDF/S & OWL schema
Query types | Full-text, Field-Search, Ontology-based, NL Query Templates | Full-text, Field-Search, Ontology-based (sparql) | Field Search, Ontology-based (itql, rdql, sparql, spo), Full-Text (Lucene or Zebra backed service)
System Features Comparison Security & DRM Aspects JeromeDL BRICKS Fedora Security Model FOAFRealm RBAC XACML Policy Granularity Resource Component, Method, Object Object, Datastream, Dissemination method DRM Model Fair use DRM under development MPEG-21 REL DRM Datastreams DRM Enabling Tool Support Watermarking
System Features Comparison Semantic Aspects & Community Features JeromeDL BRICKS Fedora Reasoning Recommendation engine based on Prolog Configurable inference engine Holding pattern; look to Mulgara Tagging Free tagging, Wordnet-based Annotation middleware component middleware/apps (e.g., NSDL/NDR; PLoSONE/Topaz) Taxonomies Any (JOnto) Any Any Knowledge Sharing SSCF component via middleware upon BRICKS via middleware upon Fedora Communities SIOC and FOAF compliance
Outline SIMILE – short overview Comparison between existing solutions Digital Libraries and Social Web Semantic Digital Libraries Scenarios
The future - Social Semantic Digital Libraries Why are current (semantic) digital libraries not enough? digital libraries should not be for librarians only but for average people they concentrate on delivering content/information, not on knowledge sharing within a community of users digital libraries have lost the human part of their predecessors
The future - Social Semantic Digital Libraries What could be the solution? make users/readers involved in the content annotation process allow users/readers to share their knowledge within a community provide better communication between users in and across communities
The future - Social Semantic Digital Libraries What is Web 2.0? The Web where “ordinary” users can meet, collaborate, and share using whatever is newly popular on the Web (tagged content, social bookmarking, AJAX, etc.) The term Web 2.0 was made popular by Tim O’Reilly: http://www.oreillynet.com/pub/a/oreilly/tim/news/2005/09/30/what-is-web-20.html Popular examples include: Bebo, del.icio.us, digg, Flickr, Google Maps, Skype, Technorati, Wikipedia…
The future - Social Semantic Digital Libraries (3) Web 2.0 focuses include: The Web as a platform for social and collaborative exchange Reusable community contributions Subscriptions to information, news, data flows, services Mass-publishing using web-based social software Social software for communication and collaboration: IM, IRC, Forums, Blogs, Wikis, Social Network Services, Social Bookmarks, MMOGs…
Social Semantic Information Spaces
Comparing Web 1.0 / Web 2.0 / Semantic Web 2.0
Web 1.0 | Web 2.0 | Semantic Web 2.0
Content Management Systems | Wikis | Semantic Wikis
Altavista, Google | Google Personalised, DumbFind | Semantic Search
Personal Websites | Blogs | Semantic Blogs
Message Boards | Community Portals | Semantic Forums and Community Portals
CiteSeer, Project Gutenberg | Google Scholar, Book Search | Social Semantic Digital Libraries
- | - | Semantic Social Information Spaces
Buddy Lists, Address Books | Online Social Networks | Semantic Social Networks
Outline SIMILE – short overview Comparison between existing solutions Digital Libraries and Social Web Semantic Digital Libraries Scenarios
Geo, Time, and Machine Tagging Geo-tagging  for resources with a specific geographical location Time-tagging  – community driven process of assigning auxiliary multimedia content  Machine-tagging  – ability to mix structured annotations into tags ROI-tagging : Regions of interest ERP game Asynchronous version with annealing of annotations for less frequently visited libraries
SDL in eLearning One of the potential sources of future e-Learning systems On the verge between formal (libraries) and informal (communities) learning sources Semantic interoperability with Learning Management Systems Improve knowledge creation, delivery and sharing
SDL in Future Museums Museums have physical objects Should bind digital annotations with physical objects Real-virtual tours Start with real, guided tour Ubiquitous browse through context information Locate other exhibitions in the vicinity  Share your knowledge and experience with others, leave bread-crumbs for others Get the most out of the exhibition during your visit
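Machine tagging, mixing structured annotations into ordinary tags, can be sketched with a small parser. The slide does not specify a syntax, so the `namespace:predicate=value` convention used here (the form popularised by Flickr machine tags) is an assumption, as are the `parse_tag` helper and the example tags.

```python
import re

# Assumed syntax: namespace:predicate=value, e.g. "geo:lat=53.27".
MACHINE_TAG = re.compile(r"^([A-Za-z][A-Za-z0-9_]*):([A-Za-z][A-Za-z0-9_]*)=(.+)$")


def parse_tag(tag):
    """Split a machine tag into its parts; plain tags are passed through."""
    m = MACHINE_TAG.match(tag)
    if m:
        return {"namespace": m.group(1), "predicate": m.group(2), "value": m.group(3)}
    return {"plain": tag}


tags = ["galway", "geo:lat=53.27", "geo:long=-9.05"]
parsed = [parse_tag(t) for t in tags]
for p in parsed:
    print(p)
```

Each parsed machine tag already has the subject-free shape of an RDF statement (predicate and value), so attaching it to the tagged resource yields a triple.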
Discussion – Feedback The Librarian from Unseen University  in Ankh-Morpork  (formerly Dr. Horace Worblehat)

Tutorial on Semantic Digital Libraries (ESWC'2007)

  • 1. Tutorial – Semantic Digital Libraries - Introduction - Sebastian R. Kruk , Bernhard Haslhofer, Philipp Nußbaumer , Sandy Payette, Tomasz Woroniecki
  • 2. Tutorial overview Who we are Sebastian R. Kruk, DERI Galway – Ireland Bernhard Haslhofer, University of Vienna - Austria Philipp Nußbaumer, Research Studios - Austria Sandy Payette, Cornell University – USA Tomasz Woroniecki, DERI Galway – Ireland Today we want to: give you a brief introduction to the Semantic Web, and show how SW is related to digital libraries; present existing semantic digital library systems; discuss the current problems and future directions of semantic digital libraries and get feedback from you After this tutorial you will know: what a semantic digital library system is; existing solutions in various degrees of detail; how to run semantic digital library solutions on your machine
  • 3. Tutorial Schedule
    Time | Session
    9:00 – 9:45 | Introduction to Semantic Digital Libraries
    9:45 – 10:30 | Existing solutions - JeromeDL
    10:30 – 11:00 | Coffee break
    11:00 – 12:15 | Existing Semantic Digital Libraries solutions
    12:15 – 12:30 | Comparison and the future of SemDL
    12:30 – 14:00 | Lunch break
    14:00 – 15:30 | Hands-on session (part I)
    15:30 – 16:00 | Coffee break
    16:00 – 16:45 | Hands-on session (part II)
    16:45 – 17:30 | Conclusions, discussion
  • 4. Outline Introduction to Semantic Web Semantic Digital Libraries
  • 5. The Semantic Web – A Brief Introduction Current Web vs. Semantic Web? An extension of the current Web in which information is given well-defined meaning, better enabling computers and people to work in cooperation. [Tim Berners-Lee] Current Web was designed for humans, and there is little information usable for machines Was the Web meant to be more? Objects with well defined attributes as opposed to untyped hyperlinks between Internet resources A network of relationships amongst named objects, yielding unified information management tasks What do you mean by “Semantic”? the semantics of something is the meaning of something Semantic Web is able to describe things in a way that computers can understand
  • 6. The Semantic Web – A Brief Introduction Where are we in the “Semantic Web layer cake”? You Are Here!
  • 7. The Semantic Web – A Brief Introduction The challenge for the Semantic Web The Semantic Web can’t work all by itself For example, it is not very likely that you will be able to sell your car just by putting your RDF file on the Web Need society-scale applications: Semantic Web agents and/or services, consumers and processors for semantic data, more advanced collaborative applications
  • 8. The Semantic Web – What is RDF? Describing things on the Semantic Web RDF (Resource Description Framework): a data format for describing information and resources, the fundamental data model for the Semantic Web Using RDF, we can describe relationships between things like: A is a part of B or Y is a member of Z and their properties (size, weight, age, price…) in a machine-understandable format where each thing has a URI The RDF graph-based model delivers straightforward machine processing Putting information into RDF files makes it possible for “scutters” or RDF crawlers to search, discover, pick up, collect, analyse and process information from the Web
  • 9. The Semantic Web – What is RDF? A simple RDF example Statement: “Stefan Decker is the creator of the resource (web page) http://www.stefandecker.org” Structure: Resource (subject) http://www.stefandecker.org Property (predicate) http://purl.org/dc/elements/1.1/creator Value (object) “Stefan Decker” Directed graph: http://www.stefandecker.org dc:creator Stefan Decker
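The statement on this slide can be written down as a single subject, predicate, object triple. The sketch below serialises it in N-Triples syntax; the `to_ntriples` helper is a hypothetical name, and it assumes the object is a plain literal without characters that need escaping.

```python
# The slide's statement as (subject, predicate, object).
subject = "http://www.stefandecker.org"
predicate = "http://purl.org/dc/elements/1.1/creator"
obj = "Stefan Decker"


def to_ntriples(s, p, o):
    """Serialise one triple with a URI subject and a plain literal object."""
    return f'<{s}> <{p}> "{o}" .'


statement = to_ntriples(subject, predicate, obj)
print(statement)
```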
  • 10. The Semantic Web – How RDF can help us? How RDF can help us? identify objects establish relationships express a new relationship → just add a new RDF statement integrate information from different sources → copy all the RDF data together RDF allows many points of view
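Both points, adding a relationship by adding a statement and integrating sources by copying the data together, follow from treating an RDF graph as a set of triples. A minimal sketch, with hypothetical `ex:` identifiers and abbreviated property names, and assuming no blank nodes (which would need renaming before a merge):

```python
# Two sources describing the same resource, each as a set of triples.
library_a = {("ex:book1", "dc:creator", "Jane Doe")}
library_b = {
    ("ex:book1", "dc:title", "An Example"),
    ("ex:book1", "dc:creator", "Jane Doe"),
}

# Integration is set union: duplicate statements collapse automatically.
merged = library_a | library_b

# A new relationship is just one more statement.
merged.add(("ex:book1", "ex:recommendedBy", "ex:user42"))

about_book1 = sorted((p, o) for s, p, o in merged if s == "ex:book1")
print(about_book1)
```

No schema migration is needed at any step, which is the contrast with fixed-schema metadata stores that the slide is drawing.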
  • 11. What is an Ontology? „ An ontology is a specification of a conceptualization.“ Tom Gruber, 1993 Ontologies are social contracts Agreed, explicit semantics Understandable to outsiders (Often) derived in a community process Ontology markup and representation languages: RDF and RDF Schema OWL Other: DAML+OIL , EER , UML , Topic Maps , MOF , XML Schemas The Semantic Web – Ontologies and Schemata
  • 12. Defines small vocabulary for RDF: Class, subClassOf, type Property, subPropertyOf domain, range Vocabulary can be used to define other vocabularies for your application domain The Semantic Web – RDF Schema Person Student Researcher subClassOf subClassOf Jeen type hasSuperVisor domain range Frank type hasSuperVisor
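What this vocabulary buys can be shown with a small inference sketch. The arrangement below (Student and Researcher as subclasses of Person, `hasSuperVisor` with domain Student and range Researcher, Jeen supervised by Frank) is a reading of the flattened figure and should be treated as an assumption; `infer_types` is a hypothetical helper that applies only the domain, range and subclass rules.

```python
SCHEMA = {
    ("Student", "subClassOf", "Person"),
    ("Researcher", "subClassOf", "Person"),
    ("hasSuperVisor", "domain", "Student"),
    ("hasSuperVisor", "range", "Researcher"),
}
DATA = {("Jeen", "hasSuperVisor", "Frank")}


def infer_types(schema, data):
    """Return (instance, class) pairs entailed by domain, range and subClassOf."""
    types = {(s, o) for s, p, o in data if p == "type"}
    # Domain and range: using a property types its subject and object.
    for s, p, o in data:
        for prop, kind, cls in schema:
            if prop == p and kind == "domain":
                types.add((s, cls))
            if prop == p and kind == "range":
                types.add((o, cls))
    # Subclass: propagate memberships upward until nothing changes.
    changed = True
    while changed:
        changed = False
        for inst, cls in list(types):
            for sub, kind, sup in schema:
                if kind == "subClassOf" and sub == cls and (inst, sup) not in types:
                    types.add((inst, sup))
                    changed = True
    return types


inferred = infer_types(SCHEMA, DATA)
print(sorted(inferred))
```

From a single data statement the schema lets a machine conclude that Jeen is a Student and a Person, and that Frank is a Researcher and a Person.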
  • 13. OWL – The Web Ontology Language Owl took Christopher Robin’s notice from Rabbit and looked at it nervously. He could spell his own name WOL, and he could spell Tuesday so that you knew it wasn’t Wednesday, and he could read quite comfortably when you weren’t looking over his shoulder and saying "Well?" all the time... OWL provides a vocabulary for defining classes, their properties and the relationships among classes. The Semantic Web – OWL [Figure: class diagram of Animal, Herbivore, Carnivore and Omnivore with owl:disjointWith] Based on Description Logics OWL is a W3C Recommendation
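A disjointness axiom lets a machine detect contradictory descriptions. The sketch below checks individuals against one assumed axiom, Herbivore disjoint with Carnivore; which pairs the original figure marks as disjoint is not recoverable from the transcript, and the individuals and the `disjointness_violations` helper are hypothetical. A real OWL reasoner does far more than this membership check.

```python
# Assumed axiom: Herbivore owl:disjointWith Carnivore.
DISJOINT = {frozenset(("Herbivore", "Carnivore"))}

# Hypothetical individuals and their asserted classes.
TYPES = {
    "leo": {"Carnivore"},
    "bambi": {"Herbivore"},
    "odd": {"Herbivore", "Carnivore"},
}


def disjointness_violations(types, disjoint):
    """List individuals asserted to belong to two disjoint classes."""
    bad = []
    for individual, classes in sorted(types.items()):
        for pair in disjoint:
            if pair <= classes:  # both disjoint classes are asserted
                bad.append(individual)
                break
    return bad


violations = disjointness_violations(TYPES, DISJOINT)
print(violations)
```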
  • 14. The Semantic Web – Applications Semantic Web cannot be and is not only a set of recommendations Semantic Web is becoming reality by applications that support it and are based on it Enabling technologies: RDF Storages: Sesame, Jena, YARS Reasoners: KAON, Racer Editors: Protege, SWOOP, MarcOnt Portal End-User applications: Semantic wikis: Makna, SemperWiki Semantic blogs Semantic digital libraries
  • 15. Outline Introduction to Semantic Web Semantic Digital Libraries
  • 16. What is a Semantic Digital Library? Semantic digital libraries integrate information based on different metadata, e.g.: resources, user profiles, bookmarks, taxonomies – high quality semantics = highly and meaningfully connected information provide interoperability with other systems (not only digital libraries) on either metadata or communication level or both – RDF as common denominator between digital libraries and other services delivering more robust, user friendly and adaptable search and browsing interfaces empowered by semantics
  • 17. Old days of hard-copy books Library: Archive (storage space) Bibliographic cards (metadata) Librarian (interface) Pros: Someone to talk to, to understand us, to explain, help in searching Cons: Based on physical location Libraries are not connected – we have to visit every place
  • 18. Yesterday of digital books Digital library Database and archive (storage) Digital bibliographic descriptions (metadata) Full-text search (interface) Pros: Content accessible online Federations of libraries – visit less places Cons: Lonely user - n o one to talk to, we need to find the right keywords, what if we do not know them (“man without an ear” paintings example) Still many problems with interconnecting (different) libraries
  • 19. Today of interconnected content Semantic Digital Libraries Database and archive (storage) Semantic bibliographic description (interconnected metadata) Search and browsing on ontologies (interface) Pros: Search and browsing based on semantics can help in substituting the librarian It is easier to interconnect heterogeneous libraries (RDF as common denominator) Cons: Semantics are created from legacy formats and are still hard for most average users to grasp
  • 20. Tomorrow of social media Social Semantic Digital Libraries Database and archive (storage) Bibliographic descriptions with annotations provided by users (metadata) Collaborative search and browsing (interface) Pros: Users contribute to the classification process Users can understand community driven annotations Users enhance digital content using blogs, wikis on the side Cons: Not everyone is convinced
  • 21. How are Semantic Digital Libraries different? Semantic digital libraries extend digital libraries by describing and exposing their resources in a machine ‘understandable’ way resources can be contents, digital artefacts organization of objects (e.g. collections) users, user communities controlled vocabularies, thesauri, taxonomies expose the semantics of their metadata in terms of an ontology defined using a formal language deliver mediation services for communication with other systems
  • 22. Semantic Web Technologies for Digital Libraries? Metadata is the key concept the Web does not have metadata the idea of a Semantic Web is nice but difficult to implement many digital libraries do have metadata in place we simply must make them available in a machine understandable format the Semantic Web provides the format: RDF
  • 23. Semantic Web Technologies for Digital Libraries? Knowledge in bibliographic records Digital Libraries already have controlled vocabularies, taxonomies or even ontologies in place the challenge is to model this knowledge in a machine understandable way the Semantic Web provides ontology languages: RDF Schema OWL SKOS
  • 24. A Sample Bibliographic Record Classification: Paintings; Object/Work type: paintings; Title: Irises; Creation-Date: 1889, earliest: 1889, latest: 1889; Subject-Matter: irises, nature, soil, etc.; Current Location-Repository Name: J. Paul Getty Museum; Creation-Creator/Role: Vincent van Gogh; painter: Gogh, Vincent van (Dutch painter, 1853-1890) Terms taken from Controlled Vocabularies Copyright 2000 The J. Paul Getty Trust & College Art Association, Inc.
  • 25. Knowledge Organization Systems tools that present the organized interpretation of knowledge structures semantic tools - meaning of words and other symbols as well as (semantic) relations between symbols and concepts organize information and promote knowledge management Examples: classification and categorization schemata (organize materials at a general level) subject headings (provide more detailed access) authority files (control variant versions of key information such as geographic names and personal names) highly structured vocabularies, such as thesauri less traditional schemes, such as semantic networks and ontologies
  • 26. Taxonomy of Knowledge Organization Systems Term Lists Authority files ( FOAF ) Glossaries Dictionaries Gazetteers Classifications and Categories ( DMoz ) Subject headings Classification schemes Taxonomies Categorization Schemes. Relationship Lists Thesauri ( WordNet, MeSH ) Semantic networks Ontologies (Hodge, 2000)
  • 27. Understanding Knowledge Organization Systems controlled vocabulary - a list of terms that have been enumerated explicitly taxonomy - a collection of controlled vocabulary terms organized into a hierarchical structure. formal ontology – a controlled vocabulary expressed in an ontology representation language. This language has a grammar for using vocabulary terms to express something meaningful within a specified domain of interest. meta-model - an explicit model of the constructs and rules needed to build specific models within a domain of interest. A valid meta-model is an ontology, but not all ontologies are modeled explicitly as meta-models. A meta-model can be viewed as a set of building blocks and rules used to build models, as a model of a domain of interest, and as an instance of another model.
  • 28. Simple Knowledge Organization Systems (SKOS) basic structure and content of concept schemes such as thesauri, classification schemes, subject heading lists, taxonomies, 'folksonomies', other types of controlled vocabulary core concepts: narrower and broader; isSubjectOf and subject; isPrimarySubjectOf and primarySubject; member and Collection; memberList and OrderedCollection; related and semanticRelation; note, definition; altLabel and prefLabel; symbol and altSymbol
  • 29. Benefits of Semantic Digital Libraries Problems of today’s libraries rapidly growing islands of highly organized information How to find things in a growing information space? is it enough to have a full-text index (à la Google)? typical “end-users” versus “expert users” converging digital library systems e.g. uniform access to Europe’s digital libraries and cultural heritage
  • 30. Benefits of Semantic Digital Libraries The two main benefits of Semantic Digital Libraries new search paradigms for the information space Ontology-based search / facet search Community-enabled browsing providing interoperability on the data level integrating metadata from various heterogeneous sources Interconnecting different digital library systems
  • 31. Searching the Sample Bibliographic Record Full-text search: “Paintings” AND “Van Gogh” AND “flowers” → no result Semantic query: if the knowledge that “irises” are “flowers” is modeled in an ontology (e.g. subclass-hierarchy), we can query for all “Paintings” by “Van Gogh” with subject “flowers” and also retrieve the picture with subject “irises” Sample record: Classification: Paintings; Object/Work type: paintings; Title: Irises; Creation-Date: 1889, earliest: 1889, latest: 1889; Subject-Matter: irises, nature, soil, etc.; Current Location-Repository Name: J. Paul Getty Museum; Creation-Creator/Role: Vincent van Gogh; painter: Gogh, Vincent van (Dutch painter, 1853-1890) Copyright 2000 The J. Paul Getty Trust & College Art Association, Inc.
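The difference between the two searches can be sketched in a few lines of Python. This is only an illustration: the `SUBCLASS_OF` hierarchy, the record layout and the function names are assumptions made for the example, not part of any of the systems presented in this tutorial.

```python
# Hypothetical subclass hierarchy: child term -> parent term.
SUBCLASS_OF = {"irises": "flowers", "sunflowers": "flowers", "flowers": "plants"}

# Toy record modeled on the fields of the sample bibliographic record.
RECORDS = [
    {"title": "Irises", "creator": "Vincent van Gogh", "type": "paintings",
     "subjects": {"irises", "nature", "soil"}},
]

def is_a(term, target):
    """True if term equals target or is a (transitive) subclass of it."""
    while term is not None:
        if term == target:
            return True
        term = SUBCLASS_OF.get(term)
    return False

def full_text_search(records, *keywords):
    """Naive keyword match: every keyword must literally occur in the record."""
    def text(r):
        parts = [r["title"], r["creator"], r["type"], *sorted(r["subjects"])]
        return " ".join(parts).lower()
    return [r for r in records if all(k.lower() in text(r) for k in keywords)]

def semantic_search(records, work_type, creator, subject):
    """Match the subject through the hierarchy instead of the literal string."""
    return [r for r in records
            if r["type"] == work_type and r["creator"] == creator
            and any(is_a(s, subject) for s in r["subjects"])]
```

The full-text variant misses the record because the word "flowers" never appears in it, while the semantic variant finds it because "irises" is declared a subclass of "flowers".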
  • 32. Semantic Digital Libraries and Existing DL Systems how to handle the legacy (meta-)data problem lifting existing (meta-)data to a semantic level simple solutions like MARC21 → DublinCore complex ontologies like MarcOnt Ontology for capturing concepts from different standards legacy libraries expose their metadata via well-established protocols (OAI-PMH, Z39.50, Dienst) - the metadata can be imported into semantic DLs semantic DLs can play the role of integration champions in the information retrieval process in heterogeneous networks
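A rough sketch of the "simple solution" route, assuming a heavily simplified MARC21 to Dublin Core mapping and a repository that speaks OAI-PMH. The `CROSSWALK` table covers only a handful of fields chosen for illustration, and the function names and base URL are hypothetical; only the `verb`, `metadataPrefix` and `set` request parameters come from the OAI-PMH protocol itself.

```python
from urllib.parse import urlencode

# Simplified, illustrative mapping: (MARC tag, subfield) -> Dublin Core element.
CROSSWALK = {
    ("245", "a"): "dc:title",
    ("100", "a"): "dc:creator",
    ("650", "a"): "dc:subject",
    ("260", "b"): "dc:publisher",
    ("260", "c"): "dc:date",
}

def marc_to_dc(fields):
    """fields: (tag, subfield, value) tuples taken from an already parsed MARC record."""
    dc = {}
    for tag, subfield, value in fields:
        element = CROSSWALK.get((tag, subfield))
        if element:
            # Strip the trailing ISBD punctuation that MARC stores with the value.
            dc.setdefault(element, []).append(value.strip(" /:,;"))
        # Unmapped fields are silently dropped: this is the information loss
        # that richer ontologies such as MarcOnt aim to avoid.
    return dc

def oai_list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL for harvesting a legacy library."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)
```

A harvester would fetch the generated URL, parse the returned records and feed them through a mapping like `marc_to_dc` before storing the result as RDF.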
  • 33. Application Areas for Semantic Web Technologies Thesauri & Controlled Vocabularies qualified DublinCore DMoz, DDC-based taxonomies SKOS, WordNet and other thesauri Schema Mappings / Crosswalks MarcOnt Ontology – aims to cover concepts from MARC21, BibTeX and DublinCore MarcOnt Mediation Services – an open mediation framework between common legacy metadata standards Metadata Integration RDF as a common data model for integrating metadata from various autonomous and heterogeneous data sources OWL for modeling the data source’s semantics SPARQL as a common query language
  • 34. Semantic DL as Evolving Knowledge Space In state-of-the-art digital libraries users are consumers Retrieve contents based on available bibliographic records Recent trends: user communities Connotea Flickr In Semantic digital libraries users are contributors as well Tagging (Web 2.0) Social Semantic Collaborative Filtering Annotations Semantic Digital libraries enforce the transition from static information to a dynamic (collaborative) knowledge space
  • 35. Existing Semantic Digital Library Systems JeromeDL a social semantic digital library makes use of Semantic Web and Social Networking technologies to enhance both interoperability and usability BRICKS aims at establishing the organizational and technological foundations for a digital library network in order to share knowledge and resources in the cultural heritage domain. FEDORA delivers a flexible service-oriented architecture for managing and delivering content in the form of digital objects SIMILE extends and leverages DSpace, seeking to enhance interoperability among digital assets, schemata, metadata, and services
  • 36. Tutorial – Semantic Digital Libraries - Existing Semantic Digital Libraries Solutions - Sebastian R. Kruk , Bernhard Haslhofer, Philipp Nußbaumer , Sandy Payette, Tomasz Woroniecki
  • 38. Tutorial 7 – Semantic Digital Libraries - Existing Semantic Digital Libraries Solutions – JeromeDL Sebastian R. Kruk , Tomasz Woroniecki
  • 39. Outline JeromeDL - Motivation and Overview JeromeDL - Architecture and Ontologies JeromeDL - Semantic Services JeromeDL - Social Services JeromeDL - Semantics in Use
  • 40. JeromeDL - Introduction Joint effort of DERI, National University of Ireland, Galway and Gdansk University of Technology (GUT) Distributed under BSD Open Source license Digital library built on semantic web technologies to answer requirements from: librarians, scientists and everyone.
  • 41. Motivation How to integrate and search information from different bibliographic sources? How to share and interconnect knowledge among people?
  • 42. JeromeDL – Motivations Use Cases Librarians: support for rich metadata (MARC21) in uploading resources, accessing bibliographic information and searching persistent identifiers Scientists: easy publishing (designed as an institute/university digital library) creating hierarchical networks of digital libraries support for accessing, sharing and searching using bibliography metadata (BibTeX) Everyone: simple search (incl. natural language queries) community-aware information sharing and browsing, support for internationalization
  • 43. JeromeDL - Motivations Support for different kinds of bibliographic metadata, like: DublinCore, BibTeX and MARC21 at the same time. Making use of existing rich sources of bibliographic descriptions (like MARC21) created by humans. Supporting users and communities: users have control over their profile information; community-aware profiles are integrated with bibliographic descriptions support for community generated knowledge Delivering communication between instances: P2P mode for searching and user authentication Hierarchical mode for browsing
  • 45. JeromeDL – Architecture Resources and annotations repository Middleware: query processing community space resources management User interface agents: Communication to the outside world Administrative interface
  • 46. Bibliographic Description in JeromeDL Dublin Core (RDF/XML): <?xml version="1.0" encoding="UTF-8"?> <rdf:Description rdf:about="http://...id=828374765"> <dc:title>JeromeDL - Adding Semantic Web Technologies to DLs</dc:title> <dc:creator>Sebastian Kruk</dc:creator> <dc:description>In recent years...</dc:description> </rdf:Description> MARC21: 01450cas 922004331i 450000100...019c19329999gw qr|p| ||||0 |0ger | a0044-2992 9a200412140219bVLOADc200404071525dvkulc200310071018dvbjc200303101205dkopumky200209211341zVLOAD aGD U/MPcGD U/MPdGD U/MFdGD U/KKsdWR O/EJ0 ager1 aZ. Kunstgesch. 0aZeitschrift für Kunstgeschichte00aZeitschrift für Kunstgeschichte.18aZfK aMünchen ;aBerlin :bDeutscher Kunstverlag,c1932-. c26-29 cm. aKwart.0 a1 Bd. (Juni 1932)-. aOpis na podst.: LCC. aW 1932 założycielami czasopisma byli Wilhelm Waetzoldt i Ernst Gall.... BibTeX: @InProceedings{jeromedexa2005, author = "Sebastian Ryszard Kruk and ...", title = "{JeromeDL - Adding Semantic ...}", booktitle = "{In Proceedings to DEXA 2005}", year = 2005} All of these can be represented in RDF
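To make "can be represented in RDF" concrete, here is a minimal sketch that turns an already parsed BibTeX entry into Dublin Core triples and prints them as N-Triples. The field-to-property mapping, the function names and the subject URI are assumptions for illustration; JeromeDL itself lifts such records into the richer MarcOnt ontology rather than plain Dublin Core.

```python
DC = "http://purl.org/dc/elements/1.1/"

# Assumed mapping from BibTeX field names to Dublin Core properties.
BIBTEX_TO_DC = {"title": DC + "title", "author": DC + "creator", "year": DC + "date"}

def bibtex_to_triples(subject_uri, entry):
    """entry: dict of BibTeX fields; returns (subject, predicate, literal) triples."""
    triples = []
    for field, value in entry.items():
        prop = BIBTEX_TO_DC.get(field)
        if prop is None:
            continue  # fields without a mapping are skipped in this sketch
        # BibTeX joins several authors with " and "; emit one triple per name.
        values = value.split(" and ") if field == "author" else [value]
        for v in values:
            triples.append((subject_uri, prop, str(v).strip("{} ")))
    return triples

def to_ntriples(triples):
    """Serialize the triples as N-Triples lines with plain string literals."""
    def esc(s):
        return s.replace("\\", "\\\\").replace('"', '\\"')
    return "\n".join(f'<{s}> <{p}> "{esc(o)}" .' for s, p, o in triples)
```

The same triple shape could be produced from a MARC21 record or from the Dublin Core XML above, which is what makes RDF usable as the common denominator between the three formats.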
  • 51. Metadata and Services in JeromeDL
  • 53. Semantic Metadata and Services
  • 54. MarcOnt Initiative – Overview Motivation: Provide set of tools for collaborative ontology development MarcOnt Initiative goals: Create a framework for collaborative ontology improvement (E-learning) Provide domain experts with tools to share their knowledge Offer tools for data mediation between different data formats
  • 55. MarcOnt Portal and MarcOnt Ontology MarcOnt Ontology: Central point of MarcOnt Initiative Translation and mediation format Continuous collaborative ontology improvement Knowledge from the domain experts MarcOnt Portal (source of knowledge): Suggestions Annotations Versioning Ontology editor
  • 56. MarcOnt Mediation Services for Legacy Metadata Format translation RDF Translator Format co-operation MarcOnt Mediation Services
  • 57. Browsing the data graph – why? The search does not end on a (long) list of results The results are not a list (!) but a graph “Lost in hyperspace” A need for a unified UI and services to filter/narrow and browse/expand Share browsing experience – navigate collaboratively
  • 58. Browsing the data graph – how? Defines REST access to services and their composition Basic services: access, search, filter, similar, browse, combine Meta services: RDF serialization, subscription channels, service ID generation Context services: manage contexts, manage service calls/compositions in the context, list contexts Statistics services: properties, values, tokens
  • 59. Browsing the data graph JeromeDL exploits interconnected data
  • 60. Browsing the data graph … to allow browsing
  • 62. Semantic Metadata and Services
  • 63. Social Services in JeromeDL Involve users into sharing knowledge Blogs – comments and discussions about documents and resources Tagging – collaborative classification Wikis – collaboratively edited additional descriptions, such as summaries and interesting facts Preserve knowledge for future use Users can learn from experience of others instantly Recommend new, interesting resources based on users’ profiles
  • 64. FOAF - Describing Social Networks FOAF - Stands for Friend-of-a-Friend Defines properties for a person (but it does not have to be a person, can be an “agent”) Does not only have to contain one person per file Can build a network of people with foaf:knows links FOAF can be easily extended to meet requirements, as in the case of FOAFRealm for identity management…
  • 65. Identity management with FOAFRealm Identity defined with extended FOAF metadata Policies expressed by social networking Distance between owner and requester Friendship level between owner and requester, calculated across digraph of social network Support for single registration and sign on Distributed identity management with HyperCuP (“D-FOAF”) FOAFRealm is currently implemented as a plugin for Tomcat (Realm/Valve implementation), with PHP and .NET versions coming soon
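The two policy ingredients named above, distance and friendship level, can be sketched over a small weighted digraph. Everything here is an assumption made for illustration: the sample network, the choice of multiplying edge levels along a path, and the thresholds. The actual FOAFRealm computation may differ.

```python
from collections import deque

# Hypothetical social digraph: person -> {friend: friendship level in (0, 1]}.
KNOWS = {
    "alice": {"bob": 0.9, "carol": 0.5},
    "bob": {"dave": 0.8},
    "carol": {"dave": 0.4},
    "dave": {},
}

def distance(graph, owner, requester):
    """Number of foaf:knows hops from owner to requester, or None if unreachable."""
    queue, seen = deque([(owner, 0)]), {owner}
    while queue:
        person, hops = queue.popleft()
        if person == requester:
            return hops
        for friend in graph.get(person, {}):
            if friend not in seen:
                seen.add(friend)
                queue.append((friend, hops + 1))
    return None

def friendship_level(graph, owner, requester, max_hops):
    """Best product of edge levels over simple paths of at most max_hops hops."""
    best = 0.0
    stack = [(owner, 1.0, 0, {owner})]
    while stack:
        person, level, hops, path = stack.pop()
        if person == requester:
            best = max(best, level)
            continue
        if hops < max_hops:
            for friend, weight in graph.get(person, {}).items():
                if friend not in path:  # never revisit a person on the same path
                    stack.append((friend, level * weight, hops + 1, path | {friend}))
    return best

def may_access(graph, owner, requester, max_hops=2, min_level=0.5):
    """Grant access if the requester is close enough and trusted enough."""
    d = distance(graph, owner, requester)
    if d is None or d > max_hops:
        return False
    return friendship_level(graph, owner, requester, max_hops) >= min_level
```

Because the graph is directed, `may_access(KNOWS, "alice", "dave")` and `may_access(KNOWS, "dave", "alice")` can give different answers, which matches the owner/requester asymmetry of the policy.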
  • 66. Social Semantic Collaborative Filtering Why? The bottom-line of acquiring knowledge: informal communication (“word of mouth”) How? Everyone classifies (filters) the information in bookmark folders (user-oriented taxonomy) Peers share (collaborate over) the information (community-driven taxonomy) Result? Knowledge “flows” from the expert through the social network to the user The system amasses a lot of information on user/community profile (context)
  • 67. Social Semantic Collaborative Filtering Problems? The horizon of a social network (2-3 degrees of separation) How to handle fine-grained information (blogs, wikis, etc.) Solutions? Inference engine to suggest knowledge from the outskirts of the social network Support for SIOC metadata : SIOC browser in SSCF Annotations and evaluations of “local” resources
  • 68. What is Social Semantic Collaborative Filtering? Goal: to enhance individual bookmarks with shared knowledge within a community Users annotate catalogues of bookmarks with semantic information taken from DMoz or WordNet vocabularies Catalogues can include (transclusion) friends' catalogues Access to catalogues can be restricted with social networking-based policies SSCF delivers: Community-oriented, semantically-rich taxonomies Information about a user's interest Flows of expertise from the domain expert Recommendations based on users' previous actions Support for SIOC metadata
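Transclusion of a friend's catalogue can be pictured as a recursive lookup. The sketch below is an assumption-laden illustration: the catalogue identifiers, the dictionary layout and the `resolve` function are invented for the example and do not reflect the SSCF data model or API.

```python
# Hypothetical catalogues: own bookmarks plus the friends' catalogues they include.
CATALOGUES = {
    "alice/semweb": {
        "bookmarks": ["http://example.org/rdf-intro"],
        "includes": ["bob/ontologies"],
    },
    "bob/ontologies": {
        "bookmarks": ["http://example.org/owl-guide"],
        "includes": ["alice/semweb"],  # mutual inclusion is possible
    },
}

def resolve(catalogue_id, catalogues=CATALOGUES, _seen=None):
    """Collect the bookmarks of a catalogue and of everything it transcludes."""
    seen = set() if _seen is None else _seen
    if catalogue_id in seen or catalogue_id not in catalogues:
        return []  # already visited (cycle) or unknown catalogue
    seen.add(catalogue_id)
    catalogue = catalogues[catalogue_id]
    result = list(catalogue["bookmarks"])
    for included in catalogue["includes"]:
        result.extend(resolve(included, catalogues, seen))
    return result
```

The `seen` set matters because friends can include each other's catalogues, so a naive recursion would never terminate.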
  • 69. Social Semantic Collaborative Filtering [Figure: graph of users and catalogues linked by foaf:knows, xfoaf:include and xfoaf:bookmark]
  • 70. Social Networks in Digital Libraries [Figure: graph connecting Resource, xfoaf:Annotation, xfoaf:Directory, creator_A, creator_B, user_C and user_D through foaf:knows, marcont:hasCreator, xfoaf:owns, xfoaf:linksTo and xfoaf:isIn]
  • 71. Support for online communities in SSCF
  • 74. JeromeDL – Delivering Semantic Content Providing semantic annotations during uploading process: open module for handling any taxonomies keywords based on WordNet and free tagging defining structure of resources in the JeromeDL ontology Lifting legacy metadata to MarcOnt ontology Community maintained annotations social semantic collaborative filtering semantic descriptions based on the FOAF metadata
  • 76. JeromeDL – Semantic Information In Use Searching: Keyword-based search with semantic query expansion Semantic search: Direct RDF querying Natural language templates Browsing: Exhibit MultiBeeBrowse Sharing: Social Semantic Collaborative Filtering Semantically Interlinked Online Communities Heterogeneous communication: Bibster, A9, OAI-PMH
  • 80. Information Retrieval in JeromeDL [Figure: architecture diagram with the labels Fulltext Index, Structure Repository, MarcOnt Repository, Resources’ Content, FOAFRealm Repository, RDF Repositories, Secure Snapshot, (typed) keywords, RDF & NL Query, OpenSearch, RSS, collaborative filtering, types translation, semantic query expansion, local interface, distributed interface]
  • 81. Networks of Digital Libraries ELP (Extensible Library Protocol) implementation communication within JeromeDL network adapters for communication with other networks D-FOAF integration (distributed user profile management) single sign-on and single registration within D-FOAF network HyperCuP integration (scalable P2P network) Independent ELP network entry point: http://search.jeromedl.org/
  • 82. Tutorial – Semantic Digital Libraries - Existing Semantic Digital Libraries Solutions – BRICKS Bernhard Haslhofer University of Vienna Austria Philipp Nußbaumer Research Studios Austria
  • 83. Outline BRICKS Overview BRICKS Components BRICKS Applications
  • 84. What is BRICKS? A software infrastructure for building digital library networks Transparent access to distributed resources Multilinguality Easy installation & maintenance A set of end-user applications Network & content management Web 2.0 tagging/annotations Domain specific applications A business model Open source, platform independent Low cost infrastructure User communities → sustainability
  • 85. BRICKS Architecture A decentralized P2P network Avoid central coordination Highly scalable, increased reliability Minimized maintenance costs Each P2P Node is a set of SOA components Web Service interface Platform independent Flexible composition Components for Storing, accessing and protecting digital objects (Semantic) search & browsing P2P communication
  • 87. A Look into a BNode [Figure: components of a BNode]
  • 89. Collection Manager Single access point for all content and metadata related operations (local and remote) Physical Collection Similar to folder/directory hierarchy in a file system Bound to a single BNode Each digital content object belongs to exactly one collection Logical Collection Virtual folder for organizing content items independent of their physical location Links to content items from various physical collections on different BNodes A content item might belong to many of them Stored Query similar to database views
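The constraints on the two collection types can be captured in a small data-structure sketch. The class and method names below are hypothetical and chosen for readability; they are not the BRICKS Collection Manager interface.

```python
from dataclasses import dataclass, field

@dataclass
class PhysicalCollection:
    name: str
    bnode: str                                  # bound to a single BNode
    items: set = field(default_factory=set)

@dataclass
class LogicalCollection:
    name: str
    links: set = field(default_factory=set)     # (bnode, item_id) references

class CollectionManager:
    """Single access point that enforces the ownership rules."""

    def __init__(self):
        self.owner = {}                         # item_id -> PhysicalCollection

    def add_item(self, collection, item_id):
        # Each content object belongs to exactly one physical collection.
        if item_id in self.owner:
            raise ValueError(f"{item_id} already belongs to {self.owner[item_id].name}")
        collection.items.add(item_id)
        self.owner[item_id] = collection

    def link(self, logical, item_id):
        # A logical collection only references items, wherever they are stored.
        physical = self.owner[item_id]
        logical.links.add((physical.bnode, item_id))
```

An item can be linked from any number of logical collections, while a second `add_item` call for the same item is rejected.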
  • 90. Content Manager Two ways to handle content in BRICKS Stored locally at site of a member party, accessed via URL Stored within BRICKS Based on Java Content Repository (JCR) Provides a meta-content model Re-use of existing content models Use standard models
  • 91. Metadata Manager Metadata descriptions → RDF Suitable for any application scenario Express relationships between objects React to changes without changing the model Schema definitions → OWL No fixed schema Extensible (e.g. Application profiles) Semantic concepts instead of schematic structures SPARQL Metadata queries over ontology concepts Queries for graph patterns
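"Queries for graph patterns" means that a query is itself a small graph with variables, and the answer is every way of binding those variables so that the pattern occurs in the data. The following toy matcher illustrates the idea on in-memory tuples; it is not a SPARQL engine, and the function names and prefixed identifiers are made up for the example.

```python
def match(pattern, triple, bindings):
    """Unify one triple pattern with one triple; return extended bindings or None."""
    new = dict(bindings)
    for p, t in zip(pattern, triple):
        if p.startswith("?"):
            # A variable must keep the value it was bound to earlier.
            if new.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return new

def query(triples, patterns):
    """Evaluate a basic graph pattern: a list of triple patterns joined on variables."""
    solutions = [{}]
    for pattern in patterns:
        solutions = [b2 for b in solutions for t in triples
                     if (b2 := match(pattern, t, b)) is not None]
    return solutions

TRIPLES = [
    ("ex:irises", "dc:title", "Irises"),
    ("ex:irises", "dc:creator", "ex:vanGogh"),
    ("ex:vanGogh", "foaf:name", "Vincent van Gogh"),
]
```

Asking for `("?work", "dc:creator", "?p")` together with `("?p", "foaf:name", "?name")` joins the two patterns on `?p`, which corresponds to listing the same two triple patterns in a SPARQL `WHERE` clause.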
  • 92. Security Manager Transparently invoked by the Framework any service call is checked Context-aware policies based on RBAC (via XACML rules) supporting Roles, Groups, at DLObject level Permission declaration through Javadoc @tags Federated identity is managed through an adapted version of OpenSAML Reputation-based Trust calculation integrated Web-based GUI for security configuration
  • 93. Digital Rights Management DRM Component Support for licenses based on MPEG-21 REL license declaration standard Generic API for the integration of commercial DRM systems Watermarking Open-source watermarking tool for images Other tools can be integrated BRICKS Store web application for commercial content Creative Commons support for other content in BRICKS
  • 95. Application: BRICKS Workspace What does it demonstrate? A web application (thin client) accessing BRICKS Foundation services Web 2.0 image annotations Reference application Primary customers General end-users (citizens) Application developers Technology Struts based interface to the BCH
  • 96. Application: BRICKS Desktop What does it demonstrate? A rich client application accessing BRICKS foundation services Direct access to the BCHN Primary customers Expert end-users (researchers, educators) Application developers Technology Eclipse based rich client interface
  • 97. Application: Annotation Tool What does it demonstrate? Tool which allows end-users to annotate images Creation of annotation threads Supervised Annotations Primary customers End-users Institutions with large image collections Technology Web Application
  • 98. Application: Online Exhibition Authoring Tool What does it demonstrate? Creating and publishing online exhibitions using content that is available in the BRICKS network Primary customers? Expert end-users (curators) Technology Web Application
  • 99. Application: Archeological Finds Identifier What does it demonstrate? A web application for comparing findings (e.g. ancient coins) with objects in reference collections Application of complex domain ontology (CIDOC-CRM) Map visualization of GIS-Metadata Primary customers? Museum curators, archaeologists, students, amateurs Technology Struts based interface
  • 100. References BRICKS Community Web Site http://www.brickscommunity.org/ Main Contact: [email protected] Related (de-facto) standards Resource Description Framework (RDF) http://www.w3.org/TR/rdf-primer/ Web Ontology Language (OWL) http://www.w3.org/TR/owl-guide/ SPARQL http://www.w3.org/TR/rdf-sparql-query/ Java Content Repository (JCR) http://www.jcp.org/en/jsr/detail?id=170 Tools and Libraries Jackrabbit http://jackrabbit.apache.org/ Jena Semantic Web Framework http://jena.sourceforge.net/
  • 101. Tutorial – Semantic Digital Libraries - Existing Semantic Digital Libraries Solutions – Fedora Sandy Payette Director, Fedora Project Cornell University
  • 102. Outline Fedora Examples: PLoS ONE and National Science Digital Library
  • 103. Fedora Semantic Digital Libraries enable … Scholarly and Scientific Workbenches “ Web 2.0” Collaborative Repositories Museum Exhibits with Lesson Plans Linking Data and Publications blog and wiki
  • 104. The Fedora Project Fedora: Flexible Extensible Digital Object Repository Architecture History Cornell Research (1997-2002) DARPA and NSF-funded research and reference implementations Distributed, Interoperable Repositories (experiments with CNRI) Open Source Project (2002-present) Andrew W. Mellon Foundation (2002-2009) Joint development by Cornell University and University of Virginia Transitioning into non-profit organization (Fedora Commons 501(c)(3))
  • 107. Fedora - Technology Integration Semantic Repository Enterprise Preservation Information Networks Contextualization Relationships Query Inference Workflow Messaging Transactions Replication Digital Objects Manage Access Versioning Storage Integrity Check Monitoring Alerting Migration
  • 108. Fedora Digital Objects Flexible object model can support Documents, articles, journals Electronic Scholarly Texts Digital Images Complex multimedia publications Datasets Metadata Learning objects More… Create “networks” of objects using RDF Define object relationships and other properties via RDF Collection/member; part/whole; etc.
  • 109. RDF in the Fedora Digital Object Model
  • 110. Motivations: Fedora and Semantic Technologies A natural model for exposing repository as network of objects Object-to-object relationships Relationships to external entities Query the graph; traversal to discover related stuff Indexing based on generalizable data model Graph-based data model is a common reduction Avoid fixed schema problems and metadata mud wrestling Extensible enrichment of object descriptions Keep overlaying statements from multiple ontologies Organic evolution Powerful queries and inference for repository management Transitive relationships among objects Dependency analysis; Detection/Extraction of sub-graphs Provenance of disseminations
  • 111. Digital Objects contain their RDF assertions Assert relationships from Fedora base ontology Collection – member Whole – part Equivalence Description Of More… Assert relationships/properties from community ontologies isAnnotationOf isRecommendedBy isCertifiedBy More ….
  • 112. Example: Digital Objects with “compositional semantics”
  • 113. Use Case: scholarly objects and annotation in the humanities museum and library objects commercial web content scholarly objects URI-100 xx:recommends URI-55 yy:certifies
  • 114. 3 Objects – 3 RDF “Relationships” Datastreams <rdf:RDF> <rdf:Description rdf:about="info:fedora/uva:pid-11"> <ais:annotationOf rdf:resource="info:fedora/uva:pid-3"/> </rdf:Description> </rdf:RDF> <rdf:RDF> <rdf:Description rdf:about="info:fedora/uva:pid-3"> <uva:hasPartLetter rdf:resource="info:fedora/uva:pid-2"/> <uva:hasPartDiagram rdf:resource="info:fedora/uva:pid-1"/> </rdf:Description> </rdf:RDF> <rdf:RDF> <rdf:Description rdf:about="info:fedora/uva:pid-10"> <ais:providesContextFor rdf:resource="info:fedora/uva:pid-3"/> </rdf:Description> </rdf:RDF>
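A relationships datastream of this shape can be read with nothing more than the Python standard library. The sketch below is illustrative only: the `uva:` namespace URI is a placeholder (the slide omits the namespace declarations), and `relationships` is a hypothetical helper, not part of Fedora.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The uva: namespace URI is a placeholder; the slide does not show the real one.
RELS = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                  xmlns:uva="http://example.org/uva#">
  <rdf:Description rdf:about="info:fedora/uva:pid-3">
    <uva:hasPartLetter rdf:resource="info:fedora/uva:pid-2"/>
    <uva:hasPartDiagram rdf:resource="info:fedora/uva:pid-1"/>
  </rdf:Description>
</rdf:RDF>"""

def relationships(rdf_xml):
    """Yield (subject, predicate, object) for each rdf:resource statement."""
    root = ET.fromstring(rdf_xml)
    for description in root.iter(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for statement in description:
            obj = statement.get(f"{{{RDF_NS}}}resource")
            if obj is not None:
                # ElementTree names tags as {namespace}localname; join them into a URI.
                predicate = statement.tag.replace("{", "").replace("}", "")
                yield subject, predicate, obj
```

Each yielded tuple is one edge in the network of objects, for example pid-3 pointing to pid-2 through `hasPartLetter`.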
  • 115. Fedora RDF-based Resource Index (RI) NOT the core object store - RI is a graph-based index of the repository Automatic, incremental indexing into triplestore Search/query the repository via Fedora RI Query Interface [Figure labels: RDF Index of Repository, RDF datastream, Fedora object properties, DC datastream, Digital Object Store]
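The idea of an index that is derived from the objects and updated incrementally can be sketched as follows. This is a toy in-memory stand-in, assuming that each object contributes a set of triples; the class and its methods are hypothetical and unrelated to the real Resource Index implementation or its query interface.

```python
class ResourceIndex:
    """Derived index: triples are grouped by the digital object they came from."""

    def __init__(self):
        self.by_object = {}                 # pid -> set of (s, p, o) triples

    def index_object(self, pid, triples):
        """(Re)index one object; its previous triples are replaced, others untouched."""
        self.by_object[pid] = set(triples)

    def purge_object(self, pid):
        """Remove everything a deleted object contributed."""
        self.by_object.pop(pid, None)

    def find(self, s=None, p=None, o=None):
        """SPO lookup across the whole repository; None acts as a wildcard."""
        all_triples = set().union(*self.by_object.values())
        return sorted(t for t in all_triples
                      if s in (None, t[0]) and p in (None, t[1]) and o in (None, t[2]))
```

Since the index holds nothing that is not also stored in the objects, it can be dropped and rebuilt from the object store at any time, which is the point of keeping it separate from the core store.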
  • 116. RI Graph - view 1 (abbreviated) …
  • 117. RI Graph - view 2 (abbreviated) …
  • 118. RI Implementation: The Triplestore Challenge Scalability Few triplestores perform well for 100M+ triples Kowari – we tested to 180M triples MPTStore – we tested to 250M triples Performance Jena - easy to get out of memory Sesame Native - slow for complex queries Kowari Fast queries and full-featured query language (iTQL) Instability and corruption problems MPTStore Very fast for SPO queries (limited support for complex queries) Add/modify significantly faster than Kowari Mulgara Fork of Kowari; complex queries; models; inference Major bug fixes to fix stability and corruption problems XA2 transactions Claims support for billions of triples
  • 119. Fedora Repository – Notable Features Generic Digital Object Model Automatic content versioning and audit trail Web Service Interfaces (REST and SOAP) Authentication Authorization Flexible fine-grained policy enforcement Built-in support for Extensible Access Control Markup Language (XACML) RDF Each object contains its own RDF assertions Repository-wide index of all objects (RDF triplestore) Self-healing – rebuild repository via digital object source files
  • 121. PLoS ONE and Topaz Open Access Publishing and Collaboration
  • 122. NSDL: Semantic Digital Library Architecture NDR
  • 123. What is NSDL committed to? NSDL 2.0 as a platform for a collaborative, contributory semantic digital library Supporting communities across the full range of science, technology, engineering and mathematics research, learning and education Supporting the creation of context around library resources to enhance discovery, use, and understanding
  • 124. NSDL Semantic Digital Library repository requirements Supports storing both content and metadata Allows arbitrary relationships among resource and metadata objects: organization, annotation, citation Accessible through web service architecture of remixable data sources and transformations
  • 125. NSDL Data Repository (NDR) Implemented in Fedora 2.2 with MPTStore Moderately large 4.7 million digital objects 250 million RDF triples Digital Objects Resources Metadata Agents Metadata providers Aggregators REST API and authentication In production at nsdl.org
  • 126. NSDL as Semantic Digital Library : collaboration, context, and contribution Platform: Fedora repository and services Applications: Solution 1: Leverage the existing successful models: blogs, wikis, bookmarking/tagging Solution 2: Leverage the existing software: WordPress, MediaWiki, Connotea, Sakai Solution 3: Engage with partners and the broader community to build applications to the platform
  • 127. Expert Voices - Blogs on top of Fedora
  • 128. Expert Voices NSDL Blogosphere (http://expertvoices.nsdl.org) Topic-based discussions (e.g. forensics) linked to related library resources A way for NSDL community members to become NSDL contributors of resources, questions, reviews, annotations, metadata Technology: WordPress-based multi-user multi-blog application (open source, plug-in architecture) Owner controls publication of entries as NSDL resources and visibility of comments (NSDL middleware and Shibboleth) Blog Entries: linked references to NSDL library resources
  • 132. NSDL 2.0 – The Whole Ecosystem … Protocol: OAI-PMH HTTP REST NDR API STEM Collections Search Service Archive Service Fedora-based NDR
  • 133. NSDL 2.0 and the Semantic Web NSDL 2.0 applications situate resources in context, aiding both discovery and use Users become contributors, adding new resources, ratings, annotations, and organizational structure – frequently as a side effect of using the library Fedora-based semantic web technology organizes resources, ties context to content, maintains provenance, enables discovery, empowers the user, and powers the library
  • 134. Fedora Web Site: www.fedora.info; Community Open Source Tools: www.fedora.info/tools; Fedora Wiki: www.fedora.info/wiki; Tutorial: http://openarchives.org/fedora/ESWC-Fedora.zip
  • 135. Tutorial – Semantic Digital Libraries - Comparison and the Future - Sebastian R. Kruk , Bernhard Haslhofer, Philipp Nußbaumer , Sandy Payette, Tomasz Woroniecki
  • 136. Outline SIMILE – short overview Comparison between existing solutions Digital Libraries and Social Web Semantic Digital Libraries Scenarios
  • 137. SIMILE – Introduction. SIMILE (Semantic Interoperability of Metadata and Information in unLike Environments): a joint project conducted by the W3C, HP, MIT Libraries, and MIT's Lab for Computer Science. It extends and leverages DSpace, seeking to enhance interoperability among digital assets, schemata, metadata, and services. Goal: make metadata interoperability easier for digital libraries by providing useful tools for browsing, searching and mapping heterogeneous metadata in RDF [MacKenzie Smith, MIT Libraries]
  • 138. SIMILE – Introduction. SIMILE enhances interoperability and provides end-user services: for digital assets, arbitrary schemata, metadata and services; across distributed individual, community, and institutional stores; through the application of RDF and semantic web techniques. It implements a digital asset dissemination architecture based upon web standards
  • 139. SIMILE – Delivered Components. Tools for metadata managers: Gadget (XML inspector); RDFizers (batch tools to transform existing XML data into RDF); Solvent (Firefox extension for JavaScript screen scraping); Welkin (graphical tool to inspect/edit RDF graph). Tools for end users: Longwell (web-based RDF faceted metadata browser); Piggy Bank (Firefox extension for personal information management of metadata in RDF); Semantic Bank (web-based server that allows data publishing and sharing by individuals, groups, or communities); Exhibit (lightweight structured data publishing framework); Timeline (AJAXy widget for visualizing time-based events)
  • 140. RDFizers - Transform XML data into RDF: tools that transform existing data into an RDF representation. List of RDFizers in SIMILE: MARC/MODS → RDF, OAI-PMH → RDF, OCW → RDF, EMail → RDF, BibTEX → RDF, Flat → RDF, Weather → RDF, Java → RDF, Javadoc → RDF, Jira → RDF, Subversion → RDF, Random → RDF
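To make the idea of transforming existing XML data into RDF concrete, here is a minimal sketch that turns a small XML record into N-Triples lines. It is not one of the SIMILE RDFizers and does not reproduce their behaviour: the record layout, the `EX` namespace, and the function names are assumptions made purely for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace and record layout, chosen only for this example.
EX = "http://example.org/terms/"

SAMPLE_XML = """
<records>
  <record id="http://example.org/book/1">
    <title>Semantic Digital Libraries</title>
    <creator>Example Author</creator>
  </record>
</records>
"""


def escape_literal(text):
    """Escape characters that N-Triples string literals cannot hold raw."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
                .replace("\r", "\\r"))


def xml_to_ntriples(xml_text):
    """Emit one N-Triples line per child element of every <record>."""
    root = ET.fromstring(xml_text)
    lines = []
    for record in root.iter("record"):
        subject = record.get("id")  # the record's id attribute is used as the subject IRI
        for field in record:
            value = escape_literal((field.text or "").strip())
            lines.append(f'<{subject}> <{EX}{field.tag}> "{value}" .')
    return lines


triples = xml_to_ntriples(SAMPLE_XML)
print("\n".join(triples))
```

Each XML element becomes one statement whose predicate is derived from the element name. A real converter would map element names onto an agreed vocabulary instead of minting predicates on the fly.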
  • 141. Solvent - JavaScript screen scraping: a Firefox extension that helps write JavaScript screen scrapers for Piggy Bank. Motivation: Piggy Bank needs web pages to embed information in RDF. Unfortunately, not many web pages embed or link to RDF information. Piggy Bank can execute a particular screen scraper on particular pages in order to "extract" the information it needs. This turns a regular web page into a semantic web page, freeing the data from the page/site that contains it.
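The screen-scraping idea can be illustrated independently of Solvent itself. The sketch below is written in Python rather than the JavaScript that Solvent scrapers use, and it assumes a hypothetical page whose markup carries `book`, `title`, and `author` class names. It extracts simple (item, property, value) statements of the kind that could later be expressed in RDF.

```python
from html.parser import HTMLParser

# Hypothetical page markup: the class names are assumptions for this sketch.
SAMPLE_PAGE = """
<html><body>
  <div class="book"><span class="title">First Book</span>
    <span class="author">A. Author</span></div>
  <div class="book"><span class="title">Second Book</span>
    <span class="author">B. Author</span></div>
</body></html>
"""


class BookScraper(HTMLParser):
    """Collect (item, property, value) statements from span elements."""

    FIELDS = {"title", "author"}

    def __init__(self):
        super().__init__()
        self.statements = []
        self._item = 0        # counter used to mint an identifier per book
        self._field = None    # property currently being read, if any

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if tag == "div" and css_class == "book":
            self._item += 1
        elif tag == "span" and css_class in self.FIELDS:
            self._field = css_class

    def handle_data(self, data):
        text = data.strip()
        if self._field and text:
            self.statements.append((f"item:{self._item}", self._field, text))

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None


scraper = BookScraper()
scraper.feed(SAMPLE_PAGE)
statements = scraper.statements
for statement in statements:
    print(statement)
```

A scraper like this is tied to one site's markup, which is why a scraper is run only on the particular pages it was written for.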
  • 142. Solvent - JavaScript screen scraping
  • 143. Longwell - RDF faceted metadata browser
  • 144. PiggyBank: Firefox extension for managing metadata (loads RDF into local Longwell server). Search and faceted browse of local RDF (views defined by library, other users). Users can find, collect, annotate RDF (can then publish for access by others)
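Faceted browsing, as offered by Longwell and Piggy Bank, boils down to two operations: counting the values of a property across the current result set, and narrowing the result set by picking one value. The sketch below shows both on a few hypothetical records. The data and the function names are assumptions and are unrelated to Longwell's actual implementation.

```python
from collections import Counter

# Hypothetical records; each maps a property to a list of values.
RECORDS = [
    {"type": ["Book"], "subject": ["RDF", "Libraries"], "year": ["2006"]},
    {"type": ["Article"], "subject": ["RDF"], "year": ["2007"]},
    {"type": ["Book"], "subject": ["Ontologies"], "year": ["2007"]},
]


def facet_counts(records, prop):
    """Count how many records carry each value of the given property."""
    counts = Counter()
    for record in records:
        counts.update(set(record.get(prop, [])))
    return counts


def restrict(records, prop, value):
    """Keep only the records that have the chosen facet value."""
    return [r for r in records if value in r.get(prop, [])]


# Select the "Book" value of the type facet, then recompute another facet
# over the narrowed result set.
books = restrict(RECORDS, "type", "Book")
year_facet = facet_counts(books, "year")
print(dict(year_facet))
```

Recomputing the counts after every restriction is what lets a user see which further refinements still lead to non-empty results.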
  • 146. SemanticBank: Semantic Bank use cases: persist information remotely on a server; share information with other people; publish your information, both in RDF and to regular web pages; for individuals, groups, communities (e.g. conference proceedings); the ability to tag resources creates a powerful serendipitous categorization; Longwell faceted browsing view of published information
  • 148. Exhibit
  • 150. System Features Comparison: General Properties
      Property | JeromeDL | BRICKS | Fedora
      OS Support | Any | Any | Any
      Hardware Requirements | 500MB RAM, min 128MB HD | 500MB RAM, min 100MB HD | 500MB RAM, min 100MB HD
      Software Requirements | Java 1.5, Tomcat 5.5, Sesame | Java 1.4/1.5, Jena | Java 1.5, Tomcat, Kowari/Mulgara or MPTStore
      Current Stage | Research Stable version 2.0.1 | Second Prototype | Production Version 2.2
      No. Installations | 12+ | ~8 | ~50 monitored; large # of downloads unmonitored
      Support Model | Open Source | Open Source | Open Source
  • 151. System Features Comparison: Architectural Aspects
      Property | JeromeDL | BRICKS | Fedora
      Distribution | Distributed searching (P2P), aggregated browsing (hierarchical) | Fully decentralized (P2P) | federation via nameresolver search services; Alvis P2P
      Architecture Granularity | Low (main building blocks) | High (many components) | High (core repository service with configurable modules; loosely coupled services)
      DB Support | Any Sesame-compliant backend | Any Jena-compliant backend | MySQL, Postgres, Oracle, McKoi; Kowari/Mulgara
  • 152. System Features Comparison: Content & Metadata Aspects
      Property | JeromeDL | BRICKS | Fedora
      Content Types | All | All | All
      Content Models | JeromeDL ontology | Any | Any
      Metadata Schema | MarcOnt + extensions | Any RDF/S & OWL schema | Any XML Schema, RDF/S & OWL schema
      Query Types | Full-text, Field-Search, Ontology-based, NL Query Templates | Full-text, Field-Search, Ontology-based (sparql) | Field Search, Ontology-based (itql, rdql, sparql, spo), Full-Text (Lucene or Zebra backed service)
  • 153. System Features Comparison Security & DRM Aspects JeromeDL BRICKS Fedora Security Model FOAFRealm RBAC XACML Policy Granularity Resource Component, Method, Object Object, Datastream, Dissemination method DRM Model Fair use DRM under development MPEG-21 REL DRM Datastreams DRM Enabling Tool Support Watermarking
  • 154. System Features Comparison: Semantic Aspects & Community Features
      Property | JeromeDL | BRICKS | Fedora
      Reasoning | Recommendation engine based on Prolog | Configurable inference engine | Holding pattern; look to Mulgara
      Tagging | Free tagging, WordNet-based | Annotation middleware component | middleware/apps (e.g., NSDL/NDR; PLoSONE/Topaz)
      Taxonomies | Any (JOnto) | Any | Any
      Knowledge Sharing | SSCF component | via middleware upon BRICKS | via middleware upon Fedora
      Communities | SIOC and FOAF compliance | |
  • 156. The future - Social Semantic Digital Libraries. Why are current (semantic) digital libraries not enough? Digital libraries should not be for librarians only but for average people; they concentrate on delivering content/information, not on knowledge sharing within a community of users; digital libraries have lost the human part of their predecessors
  • 157. The future - Social Semantic Digital Libraries. What could be the solution? Involve users/readers in the content annotation process; allow users/readers to share their knowledge within a community; provide better communication between users in and across communities
  • 158. The future - Social Semantic Digital Libraries. What is Web 2.0? The Web where "ordinary" users can meet, collaborate, and share using whatever is newly popular on the Web (tagged content, social bookmarking, AJAX, etc.). The term Web 2.0 was made popular by Tim O'Reilly: http://www.oreillynet.com/pub/a/oreilly/tim/news/2005/09/30/what-is-web-20.html Popular examples include: Bebo, del.icio.us, digg, Flickr, Google Maps, Skype, Technorati, Wikipedia…
  • 159. The future - Social Semantic Digital Libraries (3) Web 2.0 focuses include: The Web as a platform for social and collaborative exchange Reusable community contributions Subscriptions to information, news, data flows, services Mass-publishing using web-based social software Social software for communication and collaboration: IM, IRC, Forums, Blogs, Wikis, Social Network Services, Social Bookmarks, MMOGs…
  • 161. Comparing Web 1.0 / Web 2.0 / Semantic Web 2.0
      Web 1.0 | Web 2.0 | Semantic Web 2.0
      Content Management Systems | Wikis | Semantic Wikis
      Altavista, Google | Google Personalised, DumbFind | Semantic Search
      Personal Websites | Blogs | Semantic Blogs
      Message Boards | Community Portals | Semantic Forums and Community Portals
      CiteSeer, Project Gutenberg | Google Scholar, Book Search | Social Semantic Digital Libraries
      - | - | Semantic Social Information Spaces
      Buddy Lists, Address Books | Online Social Networks | Semantic Social Networks
  • 163. Geo, Time, and Machine Tagging. Geo-tagging: for resources with a specific geographical location. Time-tagging: community-driven process of assigning auxiliary multimedia content. Machine-tagging: ability to mix structured annotations into tags. ROI-tagging: regions of interest; ERP game; asynchronous version with annealing of annotations for less frequently visited libraries
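One common way to mix structured annotations into tags is the Flickr-style machine tag syntax `namespace:predicate=value`. The sketch below separates such tags from plain free-text tags. Whether the systems discussed in this tutorial use exactly this syntax is an assumption, and the sample tags and values are invented for illustration.

```python
import re

# Flickr-style machine tag: namespace:predicate=value.
# Using this exact syntax here is an assumption, not taken from the slides.
MACHINE_TAG = re.compile(
    r"^(?P<namespace>[A-Za-z][A-Za-z0-9_]*)"
    r":(?P<predicate>[A-Za-z][A-Za-z0-9_]*)"
    r"=(?P<value>.+)$"
)


def parse_tag(tag):
    """Classify a tag as plain text or as a structured machine tag."""
    match = MACHINE_TAG.match(tag)
    if match is None:
        return {"kind": "plain", "text": tag}
    return {"kind": "machine", **match.groupdict()}


# Invented sample tags: one plain tag, two geo tags, one time tag.
tags = ["semantic-web", "geo:lat=53.27", "geo:long=-9.05", "time:year=2007"]
parsed = [parse_tag(t) for t in tags]
for entry in parsed:
    print(entry)
```

Plain tags stay untouched, so users who ignore the structured form lose nothing, while geo and time tags become machine-readable property and value pairs.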
  • 164. SDL in eLearning. One of the potential sources of future e-Learning systems. On the verge between formal (libraries) and informal (communities) learning sources. Semantic interoperability with Learning Management Systems. Improve knowledge creation, delivery and sharing
  • 165. SDL in Future Museums. Museums have physical objects; should bind digital annotations with physical objects. Real-virtual tours: start with a real, guided tour; ubiquitous browse through context information; locate other exhibitions in the vicinity; share your knowledge and experience with others, leave bread-crumbs for others; get the most out of the exhibition during your visit
  • 166. Discussion – Feedback The Librarian from Unseen University in Ankh-Morpork (formerly Dr. Horace Worblehat)