code.communitydata.science - cdsc_reddit.git/blobdiff - checkpoint_parallelsql.sbatch
Add code for running tf-idf at the weekly level.
[cdsc_reddit.git] / checkpoint_parallelsql.sbatch
index a54aab124d2b83b2dd65a8bbdc4cd9d255e1ee0f..dd61e65c3a0d90e9791e55bdbd5b6df9b623b3f8 100644
 ## Walltime (12 hours)
 #SBATCH --time=12:00:00
 ## Memory per node
-#SBATCH --mem=100G
+#SBATCH --mem=32G
 #SBATCH --cpus-per-task=4
 #SBATCH --ntasks=1
-
-
+#SBATCH -D /gscratch/comdata/users/nathante/cdsc-reddit
+source ./bin/activate
 module load parallel_sql
-
+echo $(which perl)
+conda list pyarrow
+which python3
 #Put here commands to load other modules (e.g. matlab etc.)
 #Below command means that parallel_sql will get tasks from the database
 #and run them on the node (in parallel). So a 16 core node will have
