code.communitydata.science - cdsc_reddit.git/tree
Move the spark part of submissions_2_parquet to a separate script.
-rw-r--r-- 10 .gitignore
-rwxr-xr-x 1052 check_submission_shas.py
-rwxr-xr-x 5077 comments_2_parquet.py
-rwxr-xr-x 437 pull_pushshift_comments.sh
-rwxr-xr-x 732 pull_pushshift_submissions.sh
-rwxr-xr-x 5877 submissions_2_parquet_part1.py
-rw-r--r-- 1739 submissions_2_parquet_part2.py

Community Data Science Collective