The parser provided by CodeOntology analyzes Java source code and generates RDF triples. It is available on GitHub, along with a tutorial on how to use it to analyze different kinds of Java projects.

The following image shows an excerpt of the parser's output for a simple Hello World program. For more details, see the documentation and the query examples.

The parser has been applied to the OpenJDK 8 source code to extract a knowledge base. The figures below summarize the extracted RDF triples. Most of the triples describe structural information common to all object-oriented programming languages, such as class hierarchies, methods, and constructors.

  • Structural information on source code: 1981108 triples
  • DBpedia links: 309688 triples
  • Actual source code as literals: 134757 triples
  • Comments as literals: 105881 triples
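
To put these counts in proportion, the short Python sketch below sums them and prints each category's share. It assumes the four categories are disjoint, so that their sum is the size of the dataset; the text does not state this explicitly. The function name `shares` is only illustrative.

```python
# Triple counts copied from the list above.
counts = {
    "Structural information on source code": 1981108,
    "DBpedia links": 309688,
    "Actual source code as literals": 134757,
    "Comments as literals": 105881,
}


def shares(counts):
    """Return the total and each category's percentage of that total.

    Assumption: the categories do not overlap, so the sum is the dataset size.
    """
    total = sum(counts.values())
    percentages = {name: 100 * n / total for name, n in counts.items()}
    return total, percentages


total, percentages = shares(counts)
for name, pct in percentages.items():
    print(f"{name}: {counts[name]} triples ({pct:.1f}%)")
print(f"Total: {total} triples")
```

Under that assumption, structural information accounts for roughly 78% of the triples.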

The dataset extracted from OpenJDK is available for download on Zenodo. It can also be queried directly through our remote SPARQL endpoint.
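
As a rough illustration of querying a SPARQL endpoint over HTTP, the Python sketch below builds a standard SPARQL Protocol GET request. The endpoint URL is a placeholder, since the real address is not given here, and the query is a generic triple count that uses no CodeOntology-specific vocabulary. The helper names `build_request` and `run` are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Placeholder only: replace with the actual endpoint address.
ENDPOINT = "http://example.org/sparql"

# Generic query counting all triples; it assumes nothing about the ontology.
QUERY = "SELECT (COUNT(*) AS ?n) WHERE { ?s ?p ?o }"


def build_request(endpoint, query):
    """Build a SPARQL Protocol GET request that asks for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    headers = {"Accept": "application/sparql-results+json"}
    return urllib.request.Request(url, headers=headers)


def run(request):
    """Send the request and return the result bindings (needs network access)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)["results"]["bindings"]


request = build_request(ENDPOINT, QUERY)
print(request.full_url)
# Once ENDPOINT points at a real SPARQL endpoint, call: run(request)
```

The request is only constructed here, not sent, so the sketch runs without network access.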