Help:機械人 (Help:Bots)

Shortcut

H:GHY

Python

How to use pywikipedia

Shortcut

[[:help:pywikipedia]]

meta:python wikipedia bot, meta:wikipedia.py

Download Python from http://python.org/download.
Install Python.
Install pywikipediabot, either way:
1. Download m:python wikipedia bot from http://sourceforge.net/projects/pywikipediabot/, or
2. Check out the latest version via Subversion from http://svn.wikimedia.org/svnroot/pywikipedia/trunk/pywikipedia/ .

Trial run: fetch the main page of the Cantonese Wikipedia. Create a file like this one, C:\python\loadGuongDongWhaTouBan.py

import wikipedia
site = wikipedia.getSite('zh-yue', 'wikipedia') # loading a defined project's site (here the Cantonese Wikipedia)
page = wikipedia.Page(site, '%E9%A0%AD%E7%89%88') # %E9%A0%AD%E7%89%88 is the URL-encoded title 頭版 (main page)
text = page.get() # Taking the text of the page
print text
wikipedia.stopme()

Then enter C:\python\python loadGuongDongWhaTouBan.py (A no-brainer reminder: if you use Windows, remember to keep these files in the same directory as python.exe.)

To do it for real, you need to create the file user-config.py:

Configuration

user-config.py

family='wikipedia'
mylang='zh-yue'
usernames['wikipedia']['zh-yue'] = 'R. Hillgentleman'
usernames['wikiversity']['beta'] = 'R. Hillgentleman'
console_encoding = 'utf-8'
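
The lines above assign into `usernames` without ever defining it. Here is a minimal sketch of why that can work, assuming the framework creates `usernames` first and then runs user-config.py inside the same namespace. The loader `load_user_config`, the default values, and the use of `defaultdict` are illustrative assumptions, not pywikipedia's actual code, and the sketch is written for Python 3.

```python
from collections import defaultdict

# The same settings as in user-config.py above, held as text for the demo.
USER_CONFIG = """
family = 'wikipedia'
mylang = 'zh-yue'
usernames['wikipedia']['zh-yue'] = 'R. Hillgentleman'
usernames['wikiversity']['beta'] = 'R. Hillgentleman'
console_encoding = 'utf-8'
"""


def load_user_config(source):
    """Hypothetical loader: run config text in a namespace with defaults pre-set."""
    namespace = {
        "family": "wikipedia",             # assumed default
        "mylang": "en",                    # assumed default
        "usernames": defaultdict(dict),    # lets usernames[a][b] = ... work directly
        "console_encoding": None,
    }
    exec(source, namespace)
    namespace.pop("__builtins__", None)    # exec() adds this key; drop it
    return namespace


config = load_user_config(USER_CONFIG)
print(config["family"], config["mylang"], dict(config["usernames"]))
```

If the real loader works roughly like this, a typo such as `username[...]` in user-config.py would fail with a NameError at load time instead of being silently ignored.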

Boilerplate program

pywikiBoilerplate.py

import wikipedia
#set up
site = wikipedia.getSite()
page = wikipedia.Page(site, u"pageName")
 
#to get a page:
text = page.get(get_redirect = True)
 
#to update a page:
page.put(u"newText", u"Edit comment")
 
#CategoryPageGenerator
import wikipedia
import pagegenerators
import catlib

site = wikipedia.getSite()
cat = catlib.Category(site,'Category:%E9%A1%9E') # %E9%A1%9E={{subst:urlencode:類}}
gen = pagegenerators.CategorizedPageGenerator(cat)
for page in gen:
  #Do something with the page object, for example:
  text = page.get()

Editing

Add a line to wikipedia:sandbox:

import wikipedia
# Define the main function
def main():
    site = wikipedia.getSite()
    pagename = 'wikipedia:Sandbox'
    page = wikipedia.Page(site, pagename)
    wikipedia.output(u"Loading %s..." % pagename) # Please, see the "u" before the text
    try:
        text = page.get(force = False, get_redirect=False, throttle = True, sysop = False, 
                                             nofollow_redirects=False, change_edit_time = True) # text = page.get() <-- is the same
    except wikipedia.NoPage: # First except, prevent empty pages
        text = ''
    except wikipedia.IsRedirectPage: # second except, prevent redirect
        wikipedia.output(u'%s is a redirect!' % pagename)
        exit()# wikipedia.stopme() is in the finally, we don't need to use it twice, exit() will only close the script
    except wikipedia.Error: # third exception, take the problem and print
        wikipedia.output(u"Some Error, skipping..")
        exit()
    newtext = text + '\nHello, World!'
    page.put(newtext, comment='Bot: Test', watchArticle = None, minorEdit = True)  # page.put(newtext, 'Bot: Test') <-- is the same
 
if __name__ == '__main__':
    try:
        main()
    finally:
        wikipedia.stopme()
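
Before saving with page.put(), it can help to look at what would change. The following is a minimal sketch using only Python 3's standard `difflib`; it never contacts a wiki. The helper names `append_line` and `preview_edit` and the sample sandbox text are made up for illustration.

```python
import difflib


def append_line(text, line):
    """Return text with line appended on a new line, like the script above does."""
    return text + "\n" + line


def preview_edit(old, new):
    """Hypothetical helper: unified diff of the page text before and after."""
    diff = difflib.unified_diff(
        old.splitlines(),
        new.splitlines(),
        fromfile="before",
        tofile="after",
        lineterm="",
    )
    return "\n".join(diff)


old_text = "This is the sandbox."  # made-up stand-in for page.get()
new_text = append_line(old_text, "Hello, World!")
print(preview_edit(old_text, new_text))
```

An empty preview means the edit would be a no-op, so the save could be skipped.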

Which pages are in category:類?

Look up which pages are in category:類, then dump the list onto wikipedia:sandbox. [1]

getCategoryWriteYue.py

import wikipedia
import pagegenerators
import catlib

site = wikipedia.getSite()
cat = catlib.Category(site,'Category:%E9%A1%9E')   # %E9%A1%9E={{subst:urlencode:類}}
gen = pagegenerators.CategorizedPageGenerator(cat)
titles = ''   # named "titles" so the built-in list type is not shadowed
for page in gen:
  titles = titles + '\n' + page.title()

#write the list at the end of [[wikipedia:sandbox]]
sandbox = wikipedia.Page(site, u"wikipedia:sandbox")
sandboxtext = sandbox.get(get_redirect = True)

sandboxtext = sandboxtext + '\n' + titles

sandbox.put(sandboxtext, comment='Mechanical test: get pages in [[Category:%E9%A1%9E]] and dump them on [[wikipedia:sandbox]]', watchArticle = None, minorEdit = True)
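
The script above dumps bare titles, one per line. A small sketch of turning the titles into wikitext links instead, so the dump is clickable on the sandbox page. The helpers `format_title_list` and `append_section` and the sample titles are hypothetical, and no wiki is contacted.

```python
def format_title_list(titles):
    """Turn page titles into wikitext bullet links, one per line.

    The leading colon makes a plain link even when the title is a
    category or a file page.
    """
    return "\n".join("* [[:%s]]" % title for title in titles)


def append_section(page_text, block):
    """Append block to page_text, separated by exactly one blank line."""
    return page_text.rstrip("\n") + "\n\n" + block + "\n"


titles = ["Category member A", "Category member B"]  # placeholder titles
print(append_section("Sandbox text", format_title_list(titles)))
```

Building the block with `join` also avoids the stray leading newline that repeated string concatenation leaves behind.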

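Find which subcategories are in a category

C:\python>python category.py -family:wikipedia -lang:zh-yue listify -from:%E7%B6%AD%E5%9F%BA%E7%99%BE%E7%A7%91 (If you also add " -recurse:True ", the subcategories of the subcategories of the subcategories... are all included too) [2].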
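
What recursing into subcategories amounts to can be sketched on a plain dictionary. This is only an illustration of the idea, not how category.py is implemented; the function `walk_subcategories` and the toy category tree are made up.

```python
def walk_subcategories(tree, root):
    """Return every category reachable below root, each listed once."""
    found = []
    visited = {root}
    stack = [root]
    while stack:
        current = stack.pop()
        for child in tree.get(current, []):
            if child not in visited:  # guards against category loops
                visited.add(child)
                found.append(child)
                stack.append(child)
    return found


# Made-up category tree; C links back to A to show the loop guard.
tree = {"A": ["B", "C"], "B": ["D"], "C": ["A"], "D": []}
print(walk_subcategories(tree, "A"))
```

The `visited` set matters on a real wiki, where categories can contain each other and a naive recursion would never finish.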

Find which pages are in a category

C:\python>python pagegenerators.py -family:wikipedia -lang:zh-yue subcat:%E7%B6%AD%E5%9F%BA%E7%99%BE%E7%A7%91

Moving a category

C:\python>python category.py move -from:A類  -to:B類  # Chinese characters must be written in their %-encoded Unicode form

Demo: [3]
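
The %-encoded strings used on this page (for example %E9%A1%9E for 類) are just the UTF-8 bytes of the title written in hex. A small sketch for producing them, written for Python 3's standard library and separate from the pywikipedia scripts; the helper name `encode_title` is made up.

```python
from urllib.parse import quote, unquote


def encode_title(title):
    """Percent-encode a page title as UTF-8 bytes."""
    return quote(title, safe="")


print(encode_title("類"))    # %E9%A1%9E
print(encode_title("頭版"))  # %E9%A0%AD%E7%89%88
print(unquote("%E7%B6%AD%E5%9F%BA%E7%99%BE%E7%A7%91"))  # 維基百科
```

`unquote` goes the other way, which is handy for checking what an encoded argument on the command line actually says.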

Replacing with a regex

C:\python>python replace.py -links:pagename -regex "Template:(.*?)" "Template:\1"

Or a simple literal text replacement:

 C:\python>python replace.py -cat:A類 "xxx old text" "yyy new text"
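
It is worth trying a pattern on a sample string before letting a bot loose with it. The sketch below assumes that the -regex option follows the syntax of Python's `re` module; the sample text and the `T:` prefix are made up.

```python
import re

text = "See [[Template:Copyvio]] and [[Template:Stub]]."

# A lazy group at the very end of a pattern matches the empty string,
# so this substitution leaves the text unchanged.
unchanged = re.sub(r"Template:(.*?)", r"Template:\1", text)

# Closing the pattern with "]]" forces the group to capture the name,
# which \1 then copies into the replacement.
renamed = re.sub(r"\[\[Template:(.*?)\]\]", r"[[T:\1]]", text)

print(unchanged)
print(renamed)
```

The first call shows that the pattern pair used as the example above is effectively a no-op; the second shows the backreference doing real work.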

Get the history of wikipedia:sandbox

#getting the history of [[wikipedia:sandbox]]
import wikipedia #importing the wikipedia.py module
pg = wikipedia.Page(wikipedia.getSite(), 'wikipedia:sandbox') #creating the Page object corresponding to [[wikipedia:sandbox]]
x=pg.getVersionHistoryTable() #calling the function getVersionHistoryTable() in the wikipedia.py module
print x

Get the history of Main Page, then dump it onto wikipedia:sandbox

getSandboxHistoryWrite.py

#get the history of a [[Main Page]] and dump it on [[wikipedia:sandbox]]

import wikipedia #importing the wikipedia.py module
site=wikipedia.getSite()  #setting the variable site = (wikipedia, zh-yue)
pg = wikipedia.Page(site, 'Main Page') #creating the Page object corresponding to [[Main Page]]
x=pg.getVersionHistoryTable() #calling the function getVersionHistoryTable() in the wikipedia.py module

#writing the result on [[wikipedia:sandbox]]
sand = wikipedia.Page(site, 'wikipedia:sandbox') #create the object corresponding to sandbox
y = sand.get()   #read sandbox
y = y+x           #append x, the page history of [[Main Page]]
sand.put( y , 'Robot testing: dumping the history of [[Main Page]] on [[wikipedia:sandbox]]')  #write
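
The history comes back as wikitext that can be pasted straight into a page. As a rough sketch of what such a table looks like, the function below renders revision rows as a wikitext table. The column layout, the function name `history_to_wikitable`, and the sample rows are all assumptions for illustration; the actual output of getVersionHistoryTable() may differ.

```python
def history_to_wikitable(revisions):
    """Render (revision id, timestamp, user, comment) rows as a wikitext table."""
    lines = [
        '{| class="wikitable"',
        "! oldid !! date/time !! username !! edit summary",
    ]
    for oldid, timestamp, user, comment in revisions:
        lines.append("|-")
        # nowiki keeps any markup in the edit summary from being rendered
        lines.append(
            "| %s || %s || %s || <nowiki>%s</nowiki>"
            % (oldid, timestamp, user, comment)
        )
    lines.append("|}")
    return "\n".join(lines)


# Made-up sample rows, not real revision data.
sample = [
    (2, "2008-01-02T00:00:00Z", "ExampleUser", "second edit"),
    (1, "2008-01-01T00:00:00Z", "ExampleUser", "first edit"),
]
print(history_to_wikitable(sample))
```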

Demo: [4]

Get the editors of template:copyvio, then dump them onto wikipedia:sandbox

getPageContributingUsersWrite.py

#get the contributing users of [[template:copyvio]] and dump them on [[wikipedia:sandbox]]

import wikipedia #importing the wikipedia.py module
site=wikipedia.getSite()  # setting the site, from configuration
pg = wikipedia.Page(site, 'template:copyvio') #creating the Page object in question
x=pg.contributingUsers()  #getting the contributing users


sand = wikipedia.Page(site, 'wikipedia:sandbox')
y = sand.get() #getting the current sandbox
for i in x:
  y = y + '\n* ' + i   #appending each username as its own bullet line, so the names do not run together
sand.put( y , 'Robot testing: getting the contributing users of the page [[template:copyvio]] and dump the result on [[wikipedia:sandbox]]')

Demo:
[5]

Check special:newpages

#to get a list of new pages from site.newpages()
import wikipedia
site=wikipedia.getSite()

newPageList = site.newpages()
for i in newPageList:
 page, timestamp, length, empty, username, comment = i
 t = page.title()
 wikipedia.output('User:'+username+'.....Title:'+t)

Demo:

C:\Python25\pywikipedia>python newPages.py
Checked for running processes. 2 processes currently running, including the current process.
User:Contributions/219.79.136.159.....Title:闆呰檸鐭ヨ瓨+
User:Contributions/219.79.136.159.....Title:XD
User:Contributions/219.79.136.159.....Title:Orz
User:Contributions/219.79.136.159.....Title:鍥?
User:Contributions/59.112.213.23.....Title:娓呴洸绉戞妧澶у
User:Happynewyear.....Title:1250骞?
User:Happynewyear.....Title:1251骞?
User:Happynewyear.....Title:1252骞?
User:Happynewyear.....Title:1253骞?
User:Happynewyear.....Title:1254骞?

C:\Python25\pywikipedia>
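
The unreadable titles in the demo output look like UTF-8 text displayed by a console that expects a different encoding. The sketch below reproduces the effect, assuming the console used a GBK code page (a guess; the demo does not say which one) and assuming that the title shown as `1250骞?` was `1250年`. It is written for Python 3, and the helper name `garble` is made up.

```python
def garble(text, wrong_codec="gbk"):
    """Encode text as UTF-8, then decode the bytes with the wrong codec."""
    return text.encode("utf-8").decode(wrong_codec, errors="replace")


title = "1250年"
print(garble(title))                          # mojibake, as in the demo
print(title.encode("utf-8").decode("utf-8"))  # matching codecs round-trip cleanly
```

Pure ASCII titles such as `XD` and `Orz` survive because both encodings agree on those bytes, which matches the demo output. Setting console_encoding in user-config.py to whatever the console really uses should avoid the mismatch.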

Perl

en:User:Shadow1/perlwikipedia
googlecode download
