How to Extract English Fiction Prose from Project Gutenberg
Step 3: Extract book data from RDF files
First, copy the rdf2sql JAR from the previous step into the directory that contains the rdf directory with the RDF files from step 1. Then run it with rdf as its argument.
If the rdf2sql project is located in ~/dev/misc/rdf2sql and the RDF files are in a subdirectory named rdf, you can create the following script, rdf2sql.sh, to do both:
#!/bin/bash
# Copy the rdf2sql JAR into the current directory.
cp ~/dev/misc/rdf2sql/target/rdf2sql-1.2-jar-with-dependencies.jar .
echo "$(date "+%Y-%m-%d %H:%M:%S"): Start to convert RDF files to SQL"
# Convert the RDF files in the rdf directory to SQL.
java -jar rdf2sql-1.2-jar-with-dependencies.jar rdf
echo "$(date "+%Y-%m-%d %H:%M:%S"): Done"
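Make the script executable (chmod +x rdf2sql.sh), then run it like this, which shows the output on screen and also saves it, including error messages, to a log file: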
./rdf2sql.sh 2>&1 | tee rdf2sql.sh.log
After about two hours, this will generate a SQL script rdf.sql inside the rdf/cache directory. You can find a compressed version of that file here.
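Since the conversion takes a long time, it can be worth confirming that the output file was actually written before moving on. The following is a minimal sketch, not part of rdf2sql itself: check_output is a hypothetical helper name, and the only thing it assumes is the output path rdf/cache/rdf.sql mentioned above.

```shell
#!/bin/sh
# Hypothetical helper (not part of rdf2sql): succeed only if the given
# file exists and is non-empty, and report its size in bytes.
check_output() {
    file=$1
    if [ -s "$file" ]; then
        echo "OK: $file ($(wc -c < "$file") bytes)"
    else
        echo "Missing or empty: $file" >&2
        return 1
    fi
}

# Example call, using the path given in the text above:
#   check_output rdf/cache/rdf.sql
```

The check only tests that the file exists and has content; it does not validate the SQL. If it fails, rdf2sql.sh.log is the first place to look.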