code.communitydata.science - mediawiki_dump_tools.git/shortlog
mediawiki_dump_tools.git
2021-10-18  Nathan TeBlunthuis  use dataclasses and pyarrow for types.
2021-10-17  Nathan TeBlunthuis  initial work on parquet support
2019-11-11  Nathan TeBlunthuis  remove commented code  [master]
2019-11-09  Nathan TeBlunthuis  refactor regex matching in a tidier object oriented...
2019-11-09  Nathan TeBlunthuis  validate tests and add asserts and baselines for regex...
2019-11-07  sohyeonhwang        added regex scanner v2's dump unit test file regextest...
2019-11-07  sohyeonhwang        merging pull containing revert-radius with 2nd version...
2019-10-07  groceryheist        add unit tests for configuring revert_radius
2019-10-07  groceryheist        make revert radius configurable
2019-10-06  groceryheist        Merge branch 'master' into regex_scanner
2019-10-05  groceryheist        update baseline outputs
2019-10-05  groceryheist        bugfix, remove old legacy persistence flag
2019-10-05  sohyeonhwang        changes for regex scanner addition
2019-09-22  groceryheist        edont compute persistence by default
2019-09-22  groceryheist        elaborate docstring for persistence
2018-09-03  groceryheist        improve help for namespace-include
2018-09-03  groceryheist        sub assertEquals assertEqual  [advanced_persistence]
2018-09-03  Nate E TeBlunthuis  add namespace filter parameter
2018-08-24  groceryheist        Merge branch 'advanced_persistence' of code.communityda...
2018-08-24  groceryheist        Add parameter for selecting specific namespaces.
2018-08-24  groceryheist        Merge branch 'advanced_persistence' of code.communityda...
2018-08-24  Nate E TeBlunthuis  Merge branch 'advanced_persistence' of code.communityda...
2018-08-24  Nate E TeBlunthuis  add namespace filter parameter
2018-08-24  Nate E TeBlunthuis  Merge branch 'advanced_persistence' of code.communityda...
2018-08-24  Nate E TeBlunthuis  add namespace filter parameter
2018-08-24  Nate E TeBlunthuis  add namespace filter parameter
2018-08-20  groceryheist        add support for persistence with segment matching
2018-07-10  groceryheist        Prefix page titles with namespace names.  [mediawiki-utils-migration]
2018-07-05  groceryheist        migrate to mwxml. This completes the migration away...
2018-07-05  groceryheist        migrate to mwpersistence. this fixes many issues. We...
2018-07-04  groceryheist        migrate reverts to python-mwreverts
2018-07-04  groceryheist        add note to readme about dependency on compression...
2018-07-04  groceryheist        add tests for wikipedia, malformed xml, bzip2, correct...
2018-07-04  groceryheist        create baseline tests for xml dump processing
2018-05-17  Benjamin Mako...    a number of small updates and fixes
2017-12-07  groceryheist        support 7z archives with multiple files. add urlencode...
2017-02-07  Benjamin Mako...    fix code to work with bzip files
2015-07-23  Benjamin Mako...    added list of compressed dump files to .gitignore
2015-07-23  Benjamin Mako...    added support to parse namespaces from title
2015-07-23  Benjamin Mako...    added README file to document the submodule
2015-07-23  Benjamin Mako...    created new repository for wikiq with Mediawiki-Utiliti...