code.communitydata.science - cdsc_reddit.git/tree
Cache before sorting so we don't extract twice.
-rw-r--r-- 10 .gitignore
-rwxr-xr-x 1052 check_submission_shas.py
-rwxr-xr-x 5147 comments_2_parquet.py
-rwxr-xr-x 437 pull_pushshift_comments.sh
-rwxr-xr-x 732 pull_pushshift_submissions.sh
-rwxr-xr-x 5877 submissions_2_parquet_part1.py
-rw-r--r-- 1739 submissions_2_parquet_part2.py

Community Data Science Collective