Call for Papers

Most of the knowledge available on the Web is present as natural language text enclosed in Web documents aimed at human consumption. A common approach for obtaining programmatic access to such knowledge uses information extraction techniques to reduce texts written in natural languages to machine-readable structures, from which it is possible to retrieve entities and relations, obtain answers to database-style queries, etc.

The Natural Language Processing (NLP) community has been addressing this crucial task for the past few decades. As a result, the community has established gold standards and metrics to evaluate the performance of algorithms on important tasks such as Co-reference Resolution, Named Entity Recognition, Entity Linking and Relationship Extraction, to cite just a few examples. Scientific evaluation campaigns, starting with CoNLL in 2003 and followed by ACE (2005, 2007), TAC (2009, 2010, 2011, 2012) and ETAPE (2012), were proposed to compare the performance of various systems in a rigorous and reproducible manner.

Some of these topics overlap with research that the Database Systems community has also been addressing for decades, such as Identity Resolution (Deduplication, Entity Resolution, Record Linkage, etc.), Schema Mapping (Schema Mediation, Ontology Matching, etc.) and Data Fusion.

Meanwhile, the Semantic Web and Linked Data communities have been addressing questions related to how to model, serialize and share such information on the Web, as well as how to use knowledge described in more expressive formalisms for a variety of integration, retrieval and discovery tasks. Similarly, the Information Retrieval community has been paying increasing attention to the intersection of structured and unstructured data, with topics encompassing Entity-Oriented Search (cf. TREC Entity, KBA), Semantic Search, etc.

Industry-backed efforts such as Schema.org have also highlighted the Web industry’s interest in the general topics discussed in our workshop, an interest echoed by recent efforts such as Google’s Knowledge Graph, Yahoo!’s Web of Objects, Walmart Labs’ Social Genome and Microsoft’s Satori Graph.

The World Wide Web conference offers an ideal forum to discuss the intersection of those areas. The Web offers a vast amount of unstructured content from which to discover or better understand entities, while also serving as the largest knowledge graph on which to ground content. Moreover, it offers a broad range of relationships that already exist among entities. Our goal is to bring together research and expertise from different communities such as Information Extraction, Natural Language Processing, Database Systems and Semantic Web to realize and evaluate the impact of the vision of a Web of Linked Entities.


Submissions to the workshop should cover at least one of the following:

  1. Improvements upon the state of the art in NLP using information in the Web of Data (e.g. the Linked Open Data (LOD) cloud);
  2. Use of NLP techniques to improve knowledge extraction, integration and discovery in the context of the Web of Data;
  3. Knowledge extraction/retrieval from text and HTML documents (or other structured and semi-structured documents) on the Web, especially focusing on scalability, evaluation of precision & recall and/or live systems;
  4. Novel applications to search and browse the WWW with the help of extracted knowledge and the Web of Data;
  5. Methods with a special focus on Big Data due to volume (extremely large), variety (extremely heterogeneous) or speed (streaming at a fast pace);
  6. Innovative applications showing the impact of the Web of Linked Entities vision on problems/solutions affecting the local community, e.g. detecting corruption, tracking criminality, facilitating access to education or health services, helping to search for cures for neglected diseases, promoting citizen participation in government, improving tourism-related services, etc.

Other topics of interest include:

  • Text and web mining
  • Pattern and semantic analysis of natural language, reading the web, learning by reading
  • Large-scale information extraction 
  • Usage mining
  • Entity resolution and automatic discovery of entities
  • Frequent pattern analysis of entities and/or relationships
  • Entity linking, named entity disambiguation, cross-document co-reference resolution
  • Ontology representation of natural language text
  • Analysis of ontology models for natural language text
  • Learning and refinement of ontologies
  • Natural language taxonomies modeled to Semantic Web ontologies
  • Disambiguation with the support of knowledge bases
  • Multilingual information extraction
  • Use cases of entity recognition for Linked Data applications
  • Relationship extraction, slot filling
  • Impact of entity linking on information retrieval, semantic search, entity-oriented search
  • Semantic relatedness and similarity using entities and relations


More information:


The best paper will be given a "best paper award" certificate and a 16GB iPad 2.

Challenge winners will be awarded 16GB iPad 2s.

Important Dates

  • Mar 1st 2013 (previously Feb 25th 2013): research paper submissions
  • Mar 19th 2013 (previously Mar 13th 2013): research paper notifications
  • April 1st 2013 (previously March 27th 2013): camera-ready research paper
  • April 24th 2013 (previously April 10th 2013): challenge submissions
  • May 13th 2013: WoLE2013 workshop day
All deadlines are 23:59 Hawaii Time.