code.communitydata.science - cdsc_reddit.git/tree
Create parquet datasets of reddit submissions from pushshift.
-rw-r--r-- 10 .gitignore
-rwxr-xr-x 1052 check_submission_shas.py
-rwxr-xr-x 4825 comments_2_parquet.py
-rwxr-xr-x 437 pull_pushshift_comments.sh
-rwxr-xr-x 732 pull_pushshift_submissions.sh
-rwxr-xr-x 7351 submissions_2_parquet.py
