SCM2DB converts the history stored in a source code management (SCM) system (currently only Subversion) into a Neo4j graph or a set of PostgreSQL tables.

Research Code Disclaimer

This is research code, not production code. We have tested it on our systems, using our data. We are confident that it works correctly for us, but make no guarantees of suitability in other environments. Your mileage may vary.

Additionally, most of our recent work has been with Neo4j. The PostgreSQL functionality is getting a little long in the tooth and may not yield correct results. The Neo4j functionality works well for us.

Despite our disclaimer, you are welcome to use SCM2DB.jar however you like.

Requirements

The only real requirement is Java 7. This software WILL NOT work on earlier versions of Java. If you see an error message like the following, your version of Java is too old:

Exception in thread "main" java.lang.UnsupportedClassVersionError: edu/byu/cs/sequoia/scmimporter/SCM2DB : Unsupported major.minor version 51.0
	at java.lang.ClassLoader.defineClass1(Native Method)
	at java.lang.ClassLoader.defineClass(
	at Method)
	at java.lang.ClassLoader.loadClass(
	at sun.misc.Launcher$AppClassLoader.loadClass(
	at java.lang.ClassLoader.loadClass(
Could not find the main class: edu.byu.cs.sequoia.scmimporter.SCM2DB. Program will exit.
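
The "major.minor version 51.0" in the message above is the class file version that corresponds to Java 7, so the error means the running JVM is older than that. As a rough sketch, the installed version can be checked before launching the tool. The helper name java_version_ok is hypothetical and not part of SCM2DB, and the parsing assumes the usual `java -version` output format:

```shell
# Hypothetical helper (not part of SCM2DB): succeeds if the given Java
# version string denotes Java 7 or newer.
java_version_ok() {
    version="$1"
    major="${version%%.*}"            # "1.7.0_80" -> "1", "11.0.2" -> "11"
    if [ "$major" = "1" ]; then       # legacy "1.x" scheme: the real major is x
        rest="${version#1.}"
        major="${rest%%.*}"
    fi
    major="${major%%[!0-9]*}"         # drop suffixes such as "-ea"
    [ -n "$major" ] && [ "$major" -ge 7 ]
}

# Check the java found on the PATH, if there is one.
if command -v java >/dev/null 2>&1; then
    # java -version writes to stderr, e.g.: java version "1.7.0_80"
    installed="$(java -version 2>&1 | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1)"
    if java_version_ok "$installed"; then
        echo "Java $installed is new enough"
    else
        echo "Java $installed is too old; Java 7 is required"
    fi
fi
```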

However, to make full use of this tool on a large data set like the Apache Subversion repository, you will need a lot of memory. For some of our analyses we used as much as 60 GB.
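
One way to give the tool more memory is the standard JVM option -Xmx, which sets the maximum heap size. The sketch below assumes the memory in question is JVM heap and reuses the 60 GB figure purely as an illustration; the HEAP_SIZE variable is our own naming, and the paths are placeholders:

```shell
# Assumption: the memory needed is JVM heap, so -Xmx is the relevant setting.
HEAP_SIZE="${HEAP_SIZE:-60g}"   # illustrative value; adjust to your machine

if [ -f SCM2DB.jar ]; then
    java -Xmx"$HEAP_SIZE" -jar SCM2DB.jar neo4j \
        --db-path /path/to/new/db \
        --scm-url file:///path/to/scm
else
    echo "SCM2DB.jar not found in the current directory" >&2
fi
```

Smaller repositories will likely need far less; the right value depends on the data set and the analysis.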

License

This software is released under the GNU General Public License (GPL).

Using the Script

To get a list of available commands, run

java -jar SCM2DB.jar --help

This should output the following:

Build date: Mar 27, 2013 3:42:03 PM
Version:    0.1.0

try: help <command-name>
Available commands
	Command     Description
	help        Print available commands. Also, typing help <command> will print help for the specified command.
	pgsql       Import a SCM into a PostgreSQL database.
	neo4j       Import a SCM into a Neo4j database.
	metrics     Generate 2 files. The first contains comparisons between communities based on --comparison-attribute and the communities found through combinations of --edge-metric and --file-metric values as well as whole-graph community finding (unless disabled in --edge-metric).
	commdist    Write a file that represents the distribution of community sizes in a graph for one or more clusterings.
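
The "try: help <command-name>" hint suggests that each command has its own help text. A small sketch that prints the help for every listed command, assuming the hint is meant to be passed as arguments to the jar (the COMMANDS variable is our own naming):

```shell
# Assumption: "help <command-name>" is passed as arguments to the jar.
COMMANDS="pgsql neo4j metrics commdist"

if [ -f SCM2DB.jar ]; then
    for cmd in $COMMANDS; do
        java -jar SCM2DB.jar help "$cmd"
    done
else
    echo "SCM2DB.jar not found in the current directory" >&2
fi
```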

Creating Graphs

To create a graph from a Subversion repository, run

java -jar SCM2DB.jar neo4j --db-path /path/to/new/db --scm-url file:///path/to/scm


For more information, see the Javadocs.

