.. mediawiki-utilities documentation master file, created by
   sphinx-quickstart on Thu Apr 10 17:31:47 2014.
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.
MediaWiki Utilities is an open-source (MIT-licensed) library developed by Aaron Halfaker for extracting and processing data from MediaWiki installations, slave databases, and XML dumps.

**Install with pip:** ``pip install mediawiki-utilities``

**Note:** *Use of this library requires Python 3 or later.*
:ref:`mw.Timestamp <mw.types>`
    A simple datatype for handling MediaWiki's various time formats.
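
To illustrate what "various time formats" means in practice, here is a minimal standard-library sketch that accepts the compact database layout, the ISO 8601 layout used by the web API, and Unix seconds. It is not the ``mw.Timestamp`` implementation; ``parse_mw_timestamp`` and the two constants are hypothetical names chosen for this example.

```python
from datetime import datetime, timezone

# Two common MediaWiki string layouts: the compact database form and the
# ISO 8601 form returned by the web API.
SHORT_FORMAT = "%Y%m%d%H%M%S"        # e.g. 20140410173147
LONG_FORMAT = "%Y-%m-%dT%H:%M:%SZ"   # e.g. 2014-04-10T17:31:47Z


def parse_mw_timestamp(value):
    """Hypothetical helper: parse a MediaWiki time value into an aware UTC datetime."""
    if isinstance(value, (int, float)):
        # Treat numbers as seconds since the Unix epoch.
        return datetime.fromtimestamp(value, tz=timezone.utc)
    for layout in (SHORT_FORMAT, LONG_FORMAT):
        try:
            return datetime.strptime(value, layout).replace(tzinfo=timezone.utc)
        except ValueError:
            continue  # try the next layout
    raise ValueError("unrecognized timestamp: {!r}".format(value))


short = parse_mw_timestamp("20140410173147")
long_ = parse_mw_timestamp("2014-04-10T17:31:47Z")
```

Both calls describe the same instant, so ``short == long_`` holds; a single datatype that hides the layout differences is the convenience this module is described as providing.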
:ref:`mw.api <mw.api>`
    A set of utilities for interacting with MediaWiki's web API.

    * :class:`~mw.api.Session` -- Constructs an API session with a MediaWiki installation. Contains convenience methods for accessing ``prop=revisions``, ``list=usercontribs``, ``meta=siteinfo``, ``list=deletedrevs`` and ``list=recentchanges``.
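
For readers unfamiliar with the raw web API, the sketch below shows roughly what a ``prop=revisions`` request looks like on the wire. It only builds the URL with the standard library and sends nothing. ``revisions_query_url`` is a hypothetical helper, the endpoint is just an example, and none of this is meant to describe how :class:`~mw.api.Session` is implemented.

```python
from urllib.parse import urlencode


def revisions_query_url(api_url, title, properties=("ids", "timestamp", "user"), limit=10):
    """Hypothetical helper: build a ``prop=revisions`` request URL without sending it."""
    params = {
        "action": "query",
        "format": "json",
        "prop": "revisions",
        "titles": title,
        "rvprop": "|".join(properties),  # the API separates multiple values with "|"
        "rvlimit": limit,
    }
    return api_url + "?" + urlencode(params)


# Example endpoint and page title, used purely for illustration.
url = revisions_query_url("https://en.wikipedia.org/w/api.php", "User:EpochFail")
```

A session object saves you from assembling these parameters by hand for each of the query types listed above.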
:ref:`mw.database <mw.database>`
    A set of utilities for interacting with MediaWiki's database.

    * :class:`~mw.database.DB` -- Constructs a MySQL database connector with convenience methods for accessing ``revision``, ``archive``, ``page``, ``user``, and ``recentchanges``.
:ref:`mw.xml_dump <mw.xml_dump>`
    A set of utilities for processing MediaWiki's XML database dumps quickly and without dealing with streaming XML.

    * :func:`~mw.xml_dump.map` -- Applies a function to a set of dump files (:class:`~mw.xml_dump.Iterator`) using :mod:`multiprocessing` and aggregates the output.
    * :class:`~mw.xml_dump.Iterator` -- Constructs an iterator over a standard XML dump. Dumps contain site_info and pages. Pages contain metadata and revisions. Revisions contain metadata and text. This is probably why you are here.
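
The dump, page, and revision nesting described above can be pictured with a small standard-library sketch. It streams a tiny inline stand-in for a dump using ``xml.etree.ElementTree.iterparse``, which is the kind of streaming XML handling this module is meant to spare you. The sample document is heavily simplified (no XML namespace, very few fields), and ``iter_pages`` is a hypothetical helper, not part of ``mw.xml_dump``.

```python
import io
import xml.etree.ElementTree as ET

# A tiny stand-in for a dump file. Real dumps declare an XML namespace on the
# root element and carry many more fields; both are left out here.
SAMPLE_DUMP = """<mediawiki>
  <siteinfo><sitename>Example</sitename></siteinfo>
  <page>
    <title>Foo</title>
    <id>1</id>
    <revision><id>10</id><text>first draft</text></revision>
    <revision><id>11</id><text>second draft</text></revision>
  </page>
  <page>
    <title>Bar</title>
    <id>2</id>
    <revision><id>20</id><text>only version</text></revision>
  </page>
</mediawiki>"""


def iter_pages(source):
    """Yield (title, [(revision id, text), ...]) one page at a time."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "page":
            revisions = [
                (int(rev.findtext("id")), rev.findtext("text"))
                for rev in elem.iter("revision")
            ]
            yield elem.findtext("title"), revisions
            elem.clear()  # drop the finished page's children once it is consumed


pages = list(iter_pages(io.BytesIO(SAMPLE_DUMP.encode("utf-8"))))
```

Here ``pages`` holds one entry per page, each with its revisions in document order.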
:ref:`mw.lib.persistence <mw.lib.persistence>`
    A set of utilities for tracking the persistence of content between revisions.

    * :class:`~mw.lib.persistence.State` -- Constructs an object that represents the current content persistence state of a page. Reports useful details about the persistence of content when updated.
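
As a rough picture of what tracking persistence can mean, the sketch below keeps, for every whitespace-separated token in the latest revision, the list of revisions it has survived. It assumes a plain ``difflib`` token match between consecutive revisions. ``PersistenceSketch`` is a hypothetical class written for this example; the tokenization, the diff strategy, and the values returned by :class:`~mw.lib.persistence.State` may all differ.

```python
from difflib import SequenceMatcher


class PersistenceSketch:
    """Hypothetical stand-in: tracks which revisions each current token has survived."""

    def __init__(self):
        self.tokens = []    # tokens of the most recent revision
        self.history = []   # parallel list: revision ids each token has appeared in

    def update(self, text, rev_id):
        """Record a new revision and return (tokens added, tokens removed)."""
        new_tokens = text.split()
        new_history = [None] * len(new_tokens)
        matcher = SequenceMatcher(a=self.tokens, b=new_tokens, autojunk=False)
        for block in matcher.get_matching_blocks():
            for offset in range(block.size):
                # A matched token keeps its earlier history and gains this revision.
                new_history[block.b + offset] = self.history[block.a + offset] + [rev_id]
        added = 0
        for i, seen in enumerate(new_history):
            if seen is None:  # unmatched tokens were introduced by this revision
                new_history[i] = [rev_id]
                added += 1
        removed = len(self.tokens) - (len(new_tokens) - added)
        self.tokens, self.history = new_tokens, new_history
        return added, removed


state = PersistenceSketch()
first = state.update("apples are red", 1)
second = state.update("apples are blue", 2)
```

After the second update, ``state.history`` shows that "apples" and "are" have persisted through both revisions while "blue" has only appeared in the latest one.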
:ref:`mw.lib.reverts <mw.lib.reverts>`
    A set of utilities for performing revert detection.

    * :func:`~mw.lib.reverts.detect` -- Detects reverts in a sequence of revision events.
    * :class:`~mw.lib.reverts.Detector` -- Constructs an identity revert detector that can be updated manually over the history of a page.
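
An identity revert is a revision that restores a page to exactly the content of an earlier revision. The sketch below shows one way to detect that by comparing content checksums within a sliding window. ``detect_identity_reverts``, its ``radius`` default, and the shape of the yielded tuples are assumptions made for this illustration, not the signatures of :func:`~mw.lib.reverts.detect` or :class:`~mw.lib.reverts.Detector`.

```python
import hashlib


def detect_identity_reverts(revisions, radius=15):
    """Yield (reverting id, [reverted ids], reverted-to id).

    `revisions` is an iterable of (rev_id, text) pairs in page-history order.
    """
    recent = []  # (checksum, rev_id) for at most the last `radius` revisions
    for rev_id, text in revisions:
        checksum = hashlib.sha1(text.encode("utf-8")).hexdigest()
        # Search newest-first so the closest matching revision wins.
        for i in range(len(recent) - 1, -1, -1):
            if recent[i][0] == checksum:
                reverted = [rid for _, rid in recent[i + 1:]]
                if reverted:  # identical consecutive revisions revert nothing
                    yield rev_id, reverted, recent[i][1]
                break
        recent.append((checksum, rev_id))
        if len(recent) > radius:
            recent.pop(0)  # forget revisions that fell out of the window


history = [(1, "a"), (2, "b"), (3, "vandalism"), (4, "b")]
found = list(detect_identity_reverts(history))
```

In this history, revision 4 restores the content of revision 2, so revision 3 is reported as reverted.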
:ref:`mw.lib.sessions <mw.lib.sessions>`
    A set of utilities for grouping revisions and other events into sessions.

    * :func:`~mw.lib.sessions.cluster` -- Clusters a sequence of user actions into sessions.
    * :class:`~mw.lib.sessions.Cache` -- Constructs a cache of recent user actions that can be updated manually in order to detect sessions.
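
Session clustering can be sketched as grouping each user's actions so that no gap inside a group exceeds an inactivity cutoff. The example below assumes events arrive in time order as ``(user, unix_timestamp)`` pairs and uses a one-hour cutoff. ``cluster_sessions`` and its defaults are hypothetical and are not taken from :func:`~mw.lib.sessions.cluster`.

```python
def cluster_sessions(events, cutoff=3600):
    """Group (user, unix_timestamp) events, given in time order, into per-user sessions."""
    open_sessions = {}  # user -> timestamps of that user's current session
    finished = []       # (user, timestamps) for sessions that have timed out
    for user, timestamp in events:
        session = open_sessions.get(user)
        if session is not None and timestamp - session[-1] > cutoff:
            # The gap since the user's last action is too long: close the session.
            finished.append((user, session))
            session = None
        if session is None:
            session = open_sessions[user] = []
        session.append(timestamp)
    # Whatever is still open when the input ends counts as a session too.
    finished.extend(open_sessions.items())
    return finished


events = [("A", 0), ("B", 100), ("A", 1800), ("A", 9000), ("B", 200)]
sessions = cluster_sessions(events)
```

User "A" ends up with two sessions because of the two-hour gap before the action at 9000 seconds, while user "B" has one.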
:ref:`mw.lib.title <mw.lib.title>`
    A set of utilities for normalizing and parsing page titles.

    * :func:`~mw.lib.title.normalize` -- Normalizes a page title.
    * :class:`~mw.lib.title.Parser` -- Constructs a parser with a set of namespaces that can be used to parse and normalize page titles.
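
To give a feel for title normalization, the sketch below applies two common MediaWiki conventions: whitespace becomes underscores and the first character is capitalized. ``normalize_title`` is a hypothetical helper. It ignores namespaces and per-wiki capitalization settings, and :func:`~mw.lib.title.normalize` may handle more cases.

```python
def normalize_title(title):
    """Hypothetical helper: apply two common MediaWiki title conventions."""
    # Trim the title and collapse each run of whitespace into one underscore.
    title = "_".join(title.split())
    # Capitalize only the first character; the rest of the title is case-sensitive.
    return title[:1].upper() + title[1:]


normalized = normalize_title("foo bar")
```

The rest of the title is left untouched, so ``normalize_title("iPod touch")`` gives ``"IPod_touch"``.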
aaron.halfaker@gmail.com
http://halfaker.info --
http://en.wikipedia.org/wiki/User:EpochFail
None yet. See http://github.com/halfak/mediawiki-utilities. Pull requests are encouraged.