SCM2DB is a tool that converts historical information found in a Source Code Management tool (currently only Subversion) into a Neo4j graph or PostgreSQL tables.
Research Code Disclaimer
This is research code, not production code. We have tested it on our systems, using our data. We are confident that it works correctly for us, but make no guarantees of suitability in other environments. Your mileage may vary.
Additionally, most of our recent work has been with Neo4j. The PostgreSQL functionality is getting a little long in the tooth and may or may not yield appropriate results. The Neo4j functionality works great.
Despite our disclaimer, you are welcome to use SCM2DB.jar however you like.
The only real requirement is Java 7. This software WILL NOT work on earlier versions of Java. If you see an error message like the following, your version of Java is too old:
Exception in thread "main" java.lang.UnsupportedClassVersionError: edu/byu/cs/sequoia/scmimporter/SCM2DB : Unsupported major.minor version 51.0
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:634)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:277)
    at java.net.URLClassLoader.access$000(URLClassLoader.java:73)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:212)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
Could not find the main class: edu.byu.cs.sequoia.scmimporter.SCM2DB. Program will exit.
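If you would rather check your Java version before running the jar, something like the following sketch may help. It is not part of SCM2DB: the `java_major` helper is a hypothetical name, and the script assumes that `java -version` prints a line containing `version "..."` on stderr, as Oracle and OpenJDK builds commonly do. It only checks that the runtime is at least Java 7 (class file version 51.0); we have not verified how the tool behaves on much newer runtimes.

```shell
#!/bin/sh
# Hypothetical helper (not part of SCM2DB): extract the major version from a
# Java version string. Legacy strings look like "1.6.0_45" (major = 6);
# newer ones look like "17.0.2" (major = 17).
java_major() {
    case "$1" in
        1.*) printf '%s\n' "$1" | cut -d. -f2 ;;
        *)   printf '%s\n' "$1" | cut -d. -f1 ;;
    esac
}

# Assumption: `java -version` writes a line containing: version "<string>" to stderr.
if command -v java >/dev/null 2>&1; then
    installed=$(java -version 2>&1 | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1)
    if [ -n "$installed" ] && [ "$(java_major "$installed")" -ge 7 ] 2>/dev/null; then
        echo "Java $installed is new enough to load SCM2DB.jar"
    else
        echo "Java '$installed' is too old or was not recognized; Java 7 is required"
    fi
else
    echo "No java executable found on PATH"
fi
```

A result of 6 or lower from `java_major` corresponds to the `Unsupported major.minor version 51.0` error shown above.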
To make full use of this tool on a large data set like the Apache Subversion repository, however, you will also need a lot of memory. For some of our analyses we used as much as 60 GB.
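As a rough sketch of how more memory might be given to the tool, the standard JVM option `-Xmx` raises the maximum heap size. This is an assumption on our part about where the memory goes; nothing above says the 60 figure was configured as JVM heap. The `HEAP_SIZE` variable is a hypothetical name used only for illustration, and the paths are placeholders.

```shell
#!/bin/sh
# Assumption: the memory is granted through the JVM's standard -Xmx
# (maximum heap) option. HEAP_SIZE is a hypothetical variable; pick a value
# that fits your machine and data set.
HEAP_SIZE="${HEAP_SIZE:-60g}"

if [ -f SCM2DB.jar ]; then
    # Same neo4j import command as in "Creating Graphs", with a larger heap.
    java -Xmx"$HEAP_SIZE" -jar SCM2DB.jar neo4j \
        --db-path /path/to/new/db \
        --scm-url file:///path/to/scm
else
    echo "SCM2DB.jar not found in the current directory" >&2
fi
```

Smaller repositories will likely need far less; setting `HEAP_SIZE=4g` in the environment before running the script overrides the default.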
License
This software is released under the GNU General Public License (GPL).
Using the Script
To get a list of available commands, run:
java -jar SCM2DB.jar --help
This should output the following:
Build date: Mar 27, 2013 3:42:03 PM
Version: 0.1.0

try: help <command-name>

Available commands
Command    Description
help       Print available commands. Also, typing help <command> will print help for the specified command.
pgsql      Import a SCM into a PostgreSQL database.
neo4j      Import a SCM into a Neo4j database.
metrics    Generate 2 files. The first contains comparisons between communities based on --comparison-attribute and the communities found through combinations of --edge-metric and --file-metric values as well as whole-graph community finding (unless disabled in --edge-metric).
commdist   Write a file that represents the distribution of community sizes in a graph for one or more clusterings.
Creating Graphs
To create a graph from a Subversion repository:
java -jar SCM2DB.jar neo4j --db-path /path/to/new/db --scm-url file:///path/to/scm
For more information, see the Javadocs.