Introduction to Apache Any23
Library
Anything To Triples (any23) is a library, a web service and a command-line tool that extracts structured data in RDF format from a variety of Web documents. Currently it supports the following input formats:
- RDF/XML, Turtle, Notation 3
- RDFa with the RDFa 1.1 prefix mechanism
- Microformats: Adr, Geo, hCalendar, hCard, hListing, hResume, hReview, License, XFN and Species
- HTML5 Microdata (such as Schema.org)
- CSV: Comma-Separated Values with separator autodetection
A detailed description of available extractors is here.
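To make "separator autodetection" in the CSV entry above more concrete, here is a minimal sketch of one possible heuristic. It is not taken from the Apache Any23 code base: the class name `SeparatorSniffer`, the candidate set and the tie-breaking rule are all assumptions made for illustration. The idea is to pick the candidate character that occurs the same non-zero number of times on every sampled line, ignoring characters inside double quotes.

```java
import java.util.List;

// Hypothetical helper, NOT part of Apache Any23: illustrates one simple
// way a CSV separator could be guessed from a few sample lines.
public class SeparatorSniffer {

    // Candidate separators, in order of preference (an assumption of this sketch).
    private static final char[] CANDIDATES = {',', ';', '\t', '|'};

    /** Counts occurrences of sep in line, skipping text inside double quotes. */
    static int countOutsideQuotes(String line, char sep) {
        int count = 0;
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '"') {
                inQuotes = !inQuotes;
            } else if (c == sep && !inQuotes) {
                count++;
            }
        }
        return count;
    }

    /**
     * Returns the candidate that appears the same non-zero number of times on
     * every non-empty sample line, preferring the one with the most occurrences.
     * Falls back to ',' when no candidate qualifies.
     */
    static char detect(List<String> sampleLines) {
        char best = ',';
        int bestCount = 0;
        for (char sep : CANDIDATES) {
            int expected = -1;
            boolean consistent = true;
            for (String line : sampleLines) {
                if (line.isEmpty()) {
                    continue;
                }
                int n = countOutsideQuotes(line, sep);
                if (expected < 0) {
                    expected = n;
                } else if (n != expected) {
                    consistent = false;
                    break;
                }
            }
            if (consistent && expected > bestCount) {
                best = sep;
                bestCount = expected;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<String> sample = List.of(
                "name;city;note",
                "Ada;London;\"a, b\"",
                "Alan;Wilmslow;none");
        System.out.println("Detected separator: '" + detect(sample) + "'");
    }
}
```

With the sample in `main`, the comma inside the quoted field is ignored, so the semicolon is the only candidate with a consistent non-zero count. A real extractor would likely sample more lines and handle edge cases (ragged rows, other quote characters) that this sketch leaves out.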
Apache Any23 is used in major Web of Data applications such as sindice.com and sig.ma. It is written in Java and licensed under the Apache License v2.0. Apache Any23 can be used in various ways: as a library in Java applications, as a command-line tool, or as a web service.
You can download the latest release from our download page.
Documentation Content
Introduction: this page.
Install: how to install Apache Any23 library and service.
Getting Started: start using Apache Any23 command-line tools.
Supported Formats: complete list of Semantic Web formats supported by Apache Any23.
Configuration: learn how to change default library and service configuration.
REST Service: discover how to use the Apache Any23 REST Service.
Plugins: read how to install and configure the Apache Any23 plugins.
Developers: understand the Apache Any23 code internals, and learn how to write plugins and fixing rules and how to customize the code.
Community
Questions, comments? Get in touch on the mailing list! Bugs, feature requests, patches? Please submit them to the Jira issue tracker. You can access the source through Subversion; see the Installation Guide for details.
Acknowledgements
The original code base comes from open-sourcing the "RDFizer" component of the Sindice search engine. The project is supported by DERI, NUI Galway, Web of Data - FBK and the OKKAM project (ICT-215032). Individual developers who have contributed to any23 include: Michele Catasta, Richard Cyganiak, Michele Mostarda, Davide Palmisano, Gabriele Renzi, Juergen Umbrich.