Python Markup
Mostly python-creole, the little gem ...
https://pypi.python.org/pypi/python-creole
It's big and has a large community, but ...
https://pypi.python.org/pypi/Markdown
Schema.org is a collaborative, community activity with a mission to create, maintain, and promote schemas for structured data on the Internet, on web pages, in email messages, and beyond.
Schema.org vocabulary can be used with many different encodings, including RDFa, Microdata and JSON-LD. These vocabularies cover entities, relationships between entities and actions, and can easily be extended through a well-documented extension model.
Over 10 million sites use Schema.org to mark up their web pages and email messages. Many applications from Google, Microsoft, Pinterest, Yandex and others already use these vocabularies to power rich, extensible experiences.
http://schema.org/docs/documents.html
http://schema.org/docs/datamodel.html
https://schema.org/docs/gs.html
https://github.com/schemaorg/schemaorg
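To make the JSON-LD encoding mentioned above concrete, here is a minimal sketch that builds a Schema.org "Person" description with the standard json module. The function name person_jsonld and the example name and URL are made up for illustration; only "@context", "@type" and the Person properties "name" and "url" come from the Schema.org vocabulary.

```python
import json


def person_jsonld(name, url):
    """Describe a person using the Schema.org 'Person' type (hypothetical helper)."""
    return {
        "@context": "https://schema.org",
        "@type": "Person",
        "name": name,
        "url": url,
    }


# Example values are placeholders, not real data.
doc = person_jsonld("Ada Example", "https://example.org/ada")

# JSON-LD is usually embedded in a web page inside a script element.
markup = '<script type="application/ld+json">\n%s\n</script>' % json.dumps(doc, indent=2)
print(markup)
```

The same dictionary could be expressed as RDFa or Microdata attributes instead; JSON-LD is used here only because it maps directly onto a Python dict.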
Python Wiki Engines
https://wiki.python.org/moin/PythonWikiEngines
MediaWiki Markup
PythonTrac and PythonMoin markup are largely compatible with MediaWiki markup.
mwparserfromhell
http://mwparserfromhell.readthedocs.org/en/latest/
https://github.com/earwig/mwparserfromhell/
mwparserfromhell (the MediaWiki Parser from Hell) is a Python package that provides an easy-to-use and outrageously powerful parser for MediaWiki wikicode.
It supports Python 2 and Python 3 ...
... originally developed for EarwigBot
Jan 2015
https://github.com/earwig/earwigbot
A Python robot that edits Wikipedia and interacts with people over IRC
MediaWiki Parser
https://github.com/peter17/mediawiki-parser
This is a parser for MediaWiki's (MW) syntax. Its goal is to transform wikitext into an abstract syntax tree (AST) and then render this AST into various formats such as plain text and HTML ...
You must install the latest version of Pijnu ...
Nov 2012
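The wikitext to AST to output pipeline described above can be illustrated with a toy example. This is not mediawiki-parser's code or API; it is a standard-library sketch that only understands '''bold''' and ''italic'', and the names parse, render_text and render_html are invented for the illustration.

```python
import re
from html import escape

# Bold is listed before italic because ''' also begins with ''.
_TOKEN = re.compile(r"'''(.+?)'''|''(.+?)''")


def parse(wikitext):
    """Return a flat AST: a list of (kind, text) nodes."""
    nodes, pos = [], 0
    for m in _TOKEN.finditer(wikitext):
        if m.start() > pos:
            nodes.append(("text", wikitext[pos:m.start()]))
        if m.group(1) is not None:
            nodes.append(("bold", m.group(1)))
        else:
            nodes.append(("italic", m.group(2)))
        pos = m.end()
    if pos < len(wikitext):
        nodes.append(("text", wikitext[pos:]))
    return nodes


def render_text(nodes):
    """Render the AST as plain text by dropping all formatting."""
    return "".join(text for _, text in nodes)


def render_html(nodes):
    """Render the same AST as HTML."""
    templates = {"text": "%s", "bold": "<b>%s</b>", "italic": "<i>%s</i>"}
    return "".join(templates[kind] % escape(text) for kind, text in nodes)


ast = parse("A '''bold''' and ''italic'' word")
print(render_text(ast))
print(render_html(ast))
```

The point of the intermediate AST is that one parse can feed several renderers, which is the design the project description outlines.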
https://github.com/peter17/pijnu
Pijnu is a PEG parser generator and processor, written in Python, intended to be clear, easy, practical ...
Pijnu syntax is a custom, extended version of Parsing Expression Grammars (PEG), which is itself a kind of mix of BNF and regular expressions.
The major difference is that PEG is a grammar to express string recognition patterns, while BNF or regexp express string generation. As a consequence, PEG is better suited for parsing tasks. A PEG grammar clearly encodes the algorithm to parse a source string, that simply needs to be rewritten into a parser coded in a programming language.
April 2012
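The claim that a PEG grammar "simply needs to be rewritten into a parser" can be seen in a small hand translation. The grammar and function names below are made up for illustration and have nothing to do with Pijnu's own syntax or generated code; each rule becomes one function that returns (value, new_position) on success or None on failure.

```python
import re

# Grammar (generic PEG notation, hypothetical):
#   expr   <- term (('+' / '-') term)*
#   term   <- number / '(' expr ')'
#   number <- [0-9]+

_NUMBER = re.compile(r"[0-9]+")


def parse_number(text, pos):
    m = _NUMBER.match(text, pos)
    if m is None:
        return None
    return int(m.group()), m.end()


def parse_term(text, pos):
    # Ordered choice: try 'number' first, then the parenthesised form.
    result = parse_number(text, pos)
    if result is not None:
        return result
    if pos < len(text) and text[pos] == "(":
        result = parse_expr(text, pos + 1)
        if result is not None:
            value, pos = result
            if pos < len(text) and text[pos] == ")":
                return value, pos + 1
    return None


def parse_expr(text, pos=0):
    result = parse_term(text, pos)
    if result is None:
        return None
    value, pos = result
    # Repetition: keep matching (operator, term) pairs until one fails.
    while pos < len(text) and text[pos] in "+-":
        op = text[pos]
        result = parse_term(text, pos + 1)
        if result is None:
            break  # a failed iteration consumes nothing
        rhs, pos = result
        value = value + rhs if op == "+" else value - rhs
    return value, pos


print(parse_expr("1+(2-3)+10"))
```

Note how the ordered choice in term maps to "try the first alternative, fall through to the second", which is the recognition-oriented reading of the grammar that the text describes.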
SMC.MW
https://github.com/lambdafu/smc.mw/
A MediaWiki parser for Python.
... requires=["lxml", "grako"] ...
Dec 2014
https://pypi.python.org/pypi/grako/3.5.1
A generator of PEG/Packrat parsers from EBNF grammars.
March 2015
http://pythonhosted.org//grako/
Grako (for grammar compiler) is a tool that takes grammars in a variation of EBNF as input, and outputs memoizing (Packrat) PEG parsers in Python.
Grako is different from other PEG parser generators:
Generated parsers use Python's very efficient exception-handling system to backtrack. Grako generated parsers simply assert what must be parsed. There are no complicated if-then-else sequences for decision making or backtracking. Memoization allows going over the same input sequence several times in linear time.
Positive and negative lookaheads, and the cut element (with its cleaning of the memoization cache) allow for additional, hand-crafted optimizations at the grammar level.
Delegation to Python's re module for lexemes allows for (Perl-like) powerful and efficient lexical analysis.
The use of Python's context managers considerably reduces the size of the generated parsers for code clarity, and enhanced CPU-cache hits.
Include files, rule inheritance, and rule inclusion give Grako grammars considerable expressive power.
Efficient support for direct and indirect left recursion allows for more intuitive grammars.
Grako, the runtime support, and the generated parsers have measurably low Cyclomatic complexity. At around 4.5 KLOC of Python, it is possible to study all its source code in a single session.
Grako's only dependencies are on the Python 2.7, 3.4, or PyPy 2.3 standard libraries.
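Two of the ideas in this list, backtracking through exceptions and Packrat memoization, can be sketched in plain Python. This is a hand-written illustration, not Grako output and not Grako's runtime API; the class TinyParser, the exception FailedParse as defined here, and the toy grammar are all assumptions made for the example.

```python
import re


class FailedParse(Exception):
    """Raised when a rule cannot match at the current position."""


class TinyParser:
    # Toy grammar:
    #   value  <- list / number
    #   list   <- '[' value (',' value)* ']'
    #   number <- [0-9]+

    def __init__(self, text):
        self.text = text
        self.pos = 0
        self.memo = {}  # (rule name, position) -> (result, end) or FailedParse

    def token(self, literal):
        # "Assert what must be parsed": fail by raising, succeed by advancing.
        if not self.text.startswith(literal, self.pos):
            raise FailedParse("expected %r at %d" % (literal, self.pos))
        self.pos += len(literal)

    def pattern(self, regex):
        # Lexemes are delegated to the re module.
        m = re.compile(regex).match(self.text, self.pos)
        if m is None:
            raise FailedParse("expected /%s/ at %d" % (regex, self.pos))
        self.pos = m.end()
        return m.group()

    def call(self, rule):
        # Packrat step: each (rule, position) pair is evaluated at most once.
        key = (rule.__name__, self.pos)
        if key in self.memo:
            outcome = self.memo[key]
            if isinstance(outcome, FailedParse):
                raise outcome
            result, self.pos = outcome
            return result
        start = self.pos
        try:
            result = rule()
        except FailedParse as exc:
            self.pos = start
            self.memo[key] = exc
            raise
        self.memo[key] = (result, self.pos)
        return result

    def choice(self, *options):
        # Ordered choice: backtrack by catching the exception and rewinding.
        start = self.pos
        for option in options:
            try:
                return option()
            except FailedParse:
                self.pos = start
        raise FailedParse("no alternative matched at %d" % start)

    def number(self):
        return int(self.pattern(r"[0-9]+"))

    def list_(self):
        self.token("[")
        items = [self.call(self.value)]
        while True:
            start = self.pos
            try:
                self.token(",")
                items.append(self.call(self.value))
            except FailedParse:
                self.pos = start
                break
        self.token("]")
        return items

    def value(self):
        return self.choice(lambda: self.call(self.list_),
                           lambda: self.call(self.number))


parser = TinyParser("[1,[2,3],4]")
print(parser.call(parser.value))
```

The rule bodies contain no if-then-else decision logic; all failure handling lives in call and choice, which is the property the list above attributes to generated parsers.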
Pywikibot
https://www.mediawiki.org/wiki/Category:Pywikibot
https://www.mediawiki.org/wiki/Category:Pywikibot_scripts
https://www.mediawiki.org/wiki/Manual:Pywikibot
https://github.com/wikimedia/pywikibot-core
https://en.wikibooks.org/wiki/Pywikibot
https://pywikibot.readthedocs.org/en/latest/
https://en.wikiversity.org/wiki/Pywikipediabot
Pywikibot Nightlies - http://tools.wmflabs.org/pywikibot/
http://tools.wmflabs.org/pywikibot/core/
MWClient
https://github.com/mwclient/mwclient
mwclient is a lightweight Python client library to the MediaWiki API which provides access to most API functionality.
Wikitools
https://github.com/alexz-enwp/wikitools
api.py - Contains the APIRequest class, for doing queries directly, see API examples below
wiki.py - Contains the Wiki class, used for logging in to the site, storing cookies, and storing basic site information
page.py - Contains the Page class for dealing with individual pages on the wiki. Can be used to get page info and text, as well as edit and other actions if enabled on the wiki
category.py - Category is a subclass of Page with extra functions for working with categories
wikifile.py - File is a subclass of Page with extra functions for working with files - note that there may be some issues with shared repositories, as the pages for files on shared repos technically don't exist on the local wiki.
user.py - Contains the User class for getting information about and blocking/unblocking users
pagelist.py - Contains several functions for getting a list of Page objects from lists of titles, pageids, or API query results
https://code.google.com/p/python-wikitools/wiki/Documentation
MediaWiki API
http://www.mediawiki.org/wiki/API:Main_page
http://www.mediawiki.org/wiki/Category:MediaWiki_Development - this page includes http://www.mediawiki.org/wiki/Semantic_Bundle and http://www.mediawiki.org/wiki/Markup_spec
http://www.mediawiki.org/wiki/Etherpad_index
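For a feel of what the client libraries above wrap, here is a minimal sketch that builds a MediaWiki action API query URL using only the standard library (Python 3). It makes no network request. The endpoint and the helper name build_query_url are assumptions for illustration; the parameters shown (action=query, prop=revisions, rvprop=content, rvslots=main, format=json) are the usual way to ask for a page's current wikitext, but check the API documentation linked above for the wiki you target.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Assumed endpoint: the api.php entry point of www.mediawiki.org.
API_ENDPOINT = "https://www.mediawiki.org/w/api.php"


def build_query_url(title, endpoint=API_ENDPOINT):
    """Build a URL asking for the latest revision content of one page (hypothetical helper)."""
    params = {
        "action": "query",
        "prop": "revisions",
        "rvprop": "content",
        "rvslots": "main",
        "titles": title,
        "format": "json",
    }
    return endpoint + "?" + urlencode(params)


url = build_query_url("API:Main page")
print(url)

# Decode the query string again to confirm the title survived encoding.
query = parse_qs(urlsplit(url).query)
print(query["titles"])
```

Fetching the URL (for example with urllib.request) returns JSON; libraries such as mwclient and wikitools add login, cookies and paging on top of requests like this one.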
Most Powerful Python Markup Packages
Embedded HTML from https://devcharm.com/articles/203/most-powerful-markup-packages-in-python/
Reports with plots
IPython - advanced Python shell and notebook tool that can handle everything from Python to plots, including LaTeX math notation.
pyreport - generate reports from Python scripts combining text, code, and plots.
Formatting source code
pygments - powerful tool that parses source code in virtually all languages and formats it as HTML, RTF, ASCII or LaTeX.
HTML pages
Markdown - the elegant markup format used by devcharm.
docutils - parses reStructuredText markup and formats it to HTML.
markup.py - straightforward HTML/XML generator.
Parsing And Generating In General
Haven't gone through each one ... yet.
http://pyparsing.wikispaces.com/
https://pypi.python.org/pypi/parse
https://pypi.python.org/pypi/python-creole/
https://pypi.python.org/pypi/Markdown
https://pythonhosted.org/Markdown/extensions/index.html
https://www.crummy.com/software/BeautifulSoup/
Beautiful Soup is a Python library designed for quick turnaround projects like screen-scraping. Three features make it powerful:
Beautiful Soup provides a few simple methods and Pythonic idioms for navigating, searching, and modifying a parse tree: a toolkit for dissecting a document and extracting what you need. It doesn't take much code to write an application.
Beautiful Soup automatically converts incoming documents to Unicode and outgoing documents to UTF-8. You don't have to think about encodings, unless the document doesn't specify an encoding and Beautiful Soup can't detect one. Then you just have to specify the original encoding.
Beautiful Soup sits on top of popular Python parsers like lxml and html5lib, allowing you to try out different parsing strategies or trade speed for flexibility.