Wednesday, March 25, 2009

Tuesday, March 24, 2009

Saturday, March 21, 2009

Tuesday, March 17, 2009

python unicode encode & url encode

REF:http://blog.wahahajk.com/2008/10/python-unicode-encode-url-encode.html

1. Show the bytes of a string after unicode-escape encoding

For example, the Unicode code point of "科" is "\u79d1".

(u"abc我是中文123").encode("unicode-escape")

---->abc\u6211\u662f\u4e2d\u6587123


2. Next, how to show a string after URL encoding:
import urllib

urllib.quote("a=1&b=22&c=333")

---->a%3D1%26b%3D22%26c%3D333
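
For readers on Python 3, here is a minimal sketch of the same two steps. It assumes Python 3, where `urllib.quote` lives in `urllib.parse` and `encode("unicode-escape")` returns `bytes`; the variable names `escaped` and `encoded` are only illustrative.

```python
from urllib.parse import quote

# Step 1: unicode-escape encoding turns non-ASCII characters into \uXXXX sequences.
escaped = "abc我是中文123".encode("unicode-escape")
print(escaped.decode("ascii"))  # abc\u6211\u662f\u4e2d\u6587123

# Step 2: URL encoding; "=" and "&" are not in quote()'s default safe set.
encoded = quote("a=1&b=22&c=333")
print(encoded)  # a%3D1%26b%3D22%26c%3D333
```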


Advanced example:

source=u"""
"Label":["20081014","2008年10月14日"]
"""

output1=urllib.quote(source.encode("unicode-escape")) # first encode to unicode-escape bytes, then URL-encode
output2=urllib.unquote(output1).decode("unicode-escape") # URL-decode the previous result, then turn the unicode-escape bytes back into Chinese text

self.response.out.write("source= "+source)
self.response.out.write("output1= "+output1)
self.response.out.write("output2= "+output2)



Output:

=>source= "Label":["20081014","2008年10月14日"]

=>output1= %5Cn%20%20%20%20%22Label%22%3A%5B%2220081014%22%2C%222008%5Cu5e7410%5Cu670814%5Cu65e5%22%5D%5Cn%20%20%20%20

=>output2= "Label":["20081014","2008年10月14日"]
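
The round trip above can be sketched for Python 3 as follows. This is an assumption-laden adaptation, not the original App Engine handler: it drops `self.response.out.write`, puts the source on a single line, and adds an explicit `encode("ascii")` step because `unquote` returns `str` in Python 3.

```python
from urllib.parse import quote, unquote

source = '"Label":["20081014","2008年10月14日"]'

# Encode to unicode-escape bytes first, then URL-encode those bytes.
output1 = quote(source.encode("unicode-escape"))

# URL-decode, then turn the \uXXXX escapes back into the original characters.
output2 = unquote(output1).encode("ascii").decode("unicode-escape")

print("output1=", output1)
print("output2=", output2)
```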

Friday, March 13, 2009

Is there any isString, isList function in Python?

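There is no function with exactly those names, but there is something similar, and arguably better: compare the object's type with the constants in the types module.

from types import *

# test whether an object is a string
>>> type("string") is StringType
True
>>> type(124) is IntType
True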
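
A common alternative is `isinstance`, which also accepts subclasses. The sketch below assumes Python 3, where `types.StringType` and `types.IntType` no longer exist; the helper names `is_string` and `is_list` are hypothetical and simply mirror the question in the title.

```python
def is_string(obj):
    # True for str and for any subclass of str.
    return isinstance(obj, str)


def is_list(obj):
    # True for list and for any subclass of list.
    return isinstance(obj, list)


print(is_string("string"), is_string(124))  # True False
print(is_list([1, 2]), is_list((1, 2)))     # True False
```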

Thursday, March 12, 2009

Manually installing MySQLdb for Python 2.6 on Ubuntu

Prerequisites:
Python2.6
zlib-1.2.3
MySQL-python-1.2.3b1
-------------------------------------------------------------
First complete:
the Python 2.6 installation
the zlib-1.2.3 installation
$sudo apt-get install mysql-server mysql-client

If import MySQLdb in Python produces the following message
File "/usr/lib/python2.3/site-packages/MySQLdb/__init__.py", line 19,
in ?
ImportError: No module named _mysql


install
$sudo apt-get install libmysqlclient15-dev
Package                Version
libmysqlclient15-dev   5.0.51a-3ubuntu5.4
libmysqlclient15off    5.0.51a-3ubuntu5.4

Finally, install:
MySQL-python-1.2.3b1
sudo python setup.py build
sudo python setup.py install

ref:
http://ubuntuforums.org/archive/index.php/t-369409.html

Installing Python 2.6 on Ubuntu and fixing dependency problems

http://www.lysium.de/blog/index.php?/archives/229-Installing-Python-2.6-on-Ubuntu-8.04.html

I thought that was all, but the build then printed a message like this:

Failed to find the necessary bits to build these modules:
_hashlib _ssl bsddb185
bz2 gdbm readline
sunaudiodev
----------------------------------------------------------------------------
apt-get build-dep python2.5
make

Failed to find the necessary bits to build these modules:
bsddb185 sunaudiodev
---------------------------------------------------------
Download the patch, then apply it and rebuild:

patch -p1 < python2.6-disable-old-modules.patch
make
sudo make install

Django & SQLite3: "unable to open database file" problem

Everything is right in the code, and it runs normally under the Django development server (manage.py runserver). But under Apache + mod_python on a Linux server it displays the following message:
"OperationalError: unable to open database file"

One possible reason is that the user Apache (and therefore mod_python) runs as has no permission to write to the database file or to the project folder that contains it. SQLite also creates its journal file next to the database, so the folder itself must be writable. Try changing the permissions of the project folder and the database file.
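
A quick way to confirm the diagnosis is to check the permissions from Python. The sketch below is only a rough check: `check_sqlite_writable` is a hypothetical helper, and the result is meaningful only if the script runs as the same user as Apache.

```python
import os


def check_sqlite_writable(db_path):
    """Return a list of permission problems for the current user."""
    problems = []
    folder = os.path.dirname(os.path.abspath(db_path))
    # The folder must be writable so SQLite can create its journal file.
    if not os.access(folder, os.W_OK | os.X_OK):
        problems.append("folder not writable: " + folder)
    # The database file itself must be readable and writable.
    if os.path.exists(db_path) and not os.access(db_path, os.R_OK | os.W_OK):
        problems.append("database file not writable: " + db_path)
    return problems
```

For example, `check_sqlite_writable("/path/to/project/db.sqlite3")` returns an empty list when both checks pass; the path here is a placeholder.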

some ref:
http://mail.python.org/pipermail/python-list/2004-February/248859.html

Wednesday, March 11, 2009

Using Python 2.6, but mod_python still uses Python 2.5

http://anibalpacheco.com/blog/post/p-37-python26-with-mod_python-in-debian/

As you may know, there are no packages yet for Python2.6 in debian or ubuntu. If you want to install it you can do it by hand and it will work perfectly (it did for me)

Google a bit and you will find this post: http://blog.pythonaro.com/2008/10/horrible-hack-to-get-python-26-on.html

You can follow the steps in that post without problems; then remember to update your supported and default versions in /usr/share/python/debian_defaults

You can also link modules available in 2.5 so that 2.6 can use them; IPython, for example, works perfectly this way. Create symlinks in python2.6/site-packages/ pointing to python2.5/site-packages. This is not the correct way, though; the correct way is to reinstall those modules under your new configuration.

But what about mod_python? It is still tied to Python 2.5, right? YES

Well, you have to do the same: download it from modpython.org, then make and make install. Be sure to install apache2-dev, which provides a tool needed at compile time.

Everything should work; Django is also working perfectly in my new Python & mod_python 2.6 configuration under Debian.

cheers!


http://blog.pythonaro.com/2008/10/horrible-hack-to-get-python-26-on.html

I wanted to try out the newly-released 2.6 version of our beloved Python, but unfortunately Debian didn't have a package for it yet (and it still doesn't). I wasn't too afraid of screwing up my laptop, as it's probably going to be formatted very soon anyway, and I didn't want to mess around with deb build scripts, so this is what I've done:

  1. got the official source distribution, untarred and cd in the resulting dir Python2.6
  2. got some additional packages: apt-get install tk8.4-dev libgdbm-dev libdb-dev libreadline-dev libsqlite3-dev libncurses5-dev (and possibly a few others)
  3. ./configure --prefix=/usr --enable-ipv6
  4. make
  5. checkinstall -D --pkgname=python2.6 --pkgversion=2.6 --maintainer=g.lacava@gmail.com --inspect --backup=yes --install=no make altinstall
    This command allowed me to review the package contents and remove what I didn't need, which is basically everything outside the "python2.6" directories and which might already exist on my system (so I didn't want to overwrite it).
    I took out the lines /usr/bin/pydoc, /usr/bin/idle and /usr/share/man/man1/python.1
    UPDATE: when checkinstall asks if you want to create a default set of docs, say "yes", or you might get an error about ranlib further down (see comments).
  6. installed the produced .deb package
  7. copied back pydoc and idle (from the build directory) and /usr/share/man/man1/python.1 (from the Misc directory), all with "2.6" appended. I then set up alternatives with update-alternatives --install symlink name alternative priority (mainly in order to "redebianize" my impure karma); UPDATE: well, using alternatives (a 100% Debian solution which works perfectly well for loads of other multi-version script engines) will break your system, because some developers absolutely must reinvent the wheel every 5 minutes and then proudly announce that bugs won't be fixed. The stupidity of it all is staggering.

First impressions: 2.6 seems fast as hell. I don't know if this is due to the custom compilation though, rather than improvements in the runtime.

Tuesday, March 10, 2009

Change the default python version in ubuntu

http://tareqalam.wordpress.com/2008/11/28/change-the-default-python-version-in-ubuntu/

check python version
ls -l $(which python)
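
The same check can be made from inside the interpreter. This is a small sketch using only the standard library; `interpreter_info` is a hypothetical helper name.

```python
import os
import sys


def interpreter_info():
    """Report which binary is running and which version it is."""
    exe = sys.executable or ""
    return {
        "executable": exe,
        # Resolve symlinks such as /usr/bin/python -> python2.6
        "real_path": os.path.realpath(exe) if exe else "",
        "version": "%d.%d.%d" % sys.version_info[:3],
    }


print(interpreter_info())
```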

Problem: the sqlite3 module cannot be found after installing Python by hand on Ubuntu.

$ python manage.py runserver 8080
Validating models...
Unhandled exception in thread started by
Traceback (most recent call last):
File "/usr/local/lib/python2.6/site-packages/Django-1.0.2_final-py2.6.egg/django/core/management/commands/runserver.py", line 48, in inner_run
self.validate(display_num_errors=True)
File "/usr/local/lib/python2.6/site-packages/Django-1.0.2_final-py2.6.egg/django/core/management/base.py", line 246, in validate
num_errors = get_validation_errors(s, app)
File "/usr/local/lib/python2.6/site-packages/Django-1.0.2_final-py2.6.egg/django/core/management/validation.py", line 22, in get_validation_errors
from django.db import models, connection
File "/usr/local/lib/python2.6/site-packages/Django-1.0.2_final-py2.6.egg/django/db/__init__.py", line 16, in
backend = __import__('%s%s.base' % (_import_path, settings.DATABASE_ENGINE), {}, {}, [''])
File "/usr/local/lib/python2.6/site-packages/Django-1.0.2_final-py2.6.egg/django/db/backends/sqlite3/base.py", line 27, in
raise ImproperlyConfigured, "Error loading %s module: %s" % (module, exc)
django.core.exceptions.ImproperlyConfigured: Error loading sqlite3 module: No module named _sqlite3

Solution:
Reinstall sqlite3 and Python 2.6.1 by hand
First install sqlite3
# wget http://www.sqlite.org/sqlite-3.6.6.2.tar.gz
# tar xzvf sqlite-3.6.6.2.tar.gz
# cd sqlite-3.6.6.2
# ./configure
# make
# make install
Then install Python 2.6.1
# wget http://www.python.org/ftp/python/2.6.1/Python-2.6.1.tgz
# tar xvf Python-2.6.1.tgz
# cd Python-2.6.1
# ./configure
# make
# make install

Test:

$ python manage.py runserver 8080
Validating models...
0 errors found

Django version 1.0.2 final, using settings 'mysite.settings'
Development server is running at http://127.0.0.1:8080/
Quit the server with CONTROL-C.

Done.
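
Independently of Django, a short script can confirm that the rebuilt interpreter has a working sqlite3 module. This is a minimal sketch using an in-memory database; `sqlite_smoke_test` is a hypothetical name.

```python
import sqlite3


def sqlite_smoke_test():
    """Create an in-memory table, insert one row, and read it back."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE t (x INTEGER)")
        conn.execute("INSERT INTO t VALUES (?)", (42,))
        return conn.execute("SELECT x FROM t").fetchone()[0]
    finally:
        conn.close()


# If the import above fails with "No module named _sqlite3", the rebuild did not help.
print("SQLite library version:", sqlite3.sqlite_version)
print("smoke test value:", sqlite_smoke_test())
```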

Sunday, March 8, 2009

Web::Scraper in Python

http://d.hatena.ne.jp/akkt/20070911/1189521960
http://www.packtpub.com/article/web-scraping-with-python
http://code.activestate.com/recipes/286269/
http://www.goldb.org/geo_maps/
#!/usr/bin/env python2.5
from scraper import scraper, process
import codecs, sys
sys.stdout = codecs.getwriter('utf-8')(sys.stdout)

s = scraper(
    process('span.title > a:first-child', title='TEXT', url='@href'),
    process('span.furigana', furigana='TEXT'),
    process('ul.list-circle > li:first-child > a', category='TEXT'),
)

result = s.scrape('http://d.hatena.ne.jp/keyword/%BA%B0%CC%EE%A4%A2%A4%B5%C8%FE')
print ''.join(result['category'])
print ''.join(result['furigana'])
print ''.join(result['title'])
print ''.join(result['url'])
$ ./keyword.py
アイドル
こんのあさみ
紺野あさ美
/keyword/%ba%b0%cc%ee%a4%a2%a4%b5%c8%fe

An example, taken from Web::Scraper, that fetches thumbnail URLs from Flickr.

#!/usr/bin/env python2.5
from scraper import scraper, process
import codecs, sys
sys.stdout = codecs.getwriter('utf-8')(sys.stdout)

s = scraper(
    process('a.image_link img', thumbs="@src"),
)

result = s.scrape('http://www.flickr.com/photos/bulknews/sets/72157601700510359/')

print "\n".join(result['thumbs'])

scraper.py itself looks like this:

#!/usr/bin/env python2.5
# -*- coding: utf-8 -*-
from urllib import urlopen
from lxml import etree

def scraper(*funcs):
    class Scraper(object):
        def __init__(self, funcs):
            self.funcs = funcs

        def scrape(self, url):
            from StringIO import StringIO
            stash = {}
            res = urlopen(url)
            # decode with the charset from the HTTP headers, falling back to latin-1
            html = res.read().decode(res.headers.getparam('charset') or 'latin-1')
            tree = etree.parse(StringIO(html), etree.HTMLParser())
            for f in self.funcs:
                xpath, attr = f()
                for key, val in attr.iteritems():
                    if val.startswith('@'):
                        # '@name' collects the attribute "name" of every match
                        stash[key] = [e.attrib[val[1:]] for e in tree.xpath(xpath)]
                    elif val.upper() == "TEXT":
                        stash[key] = [e.text for e in tree.xpath(xpath)]
                    else:
                        print "Got an unknown thingy: ", val
            return stash

    return Scraper(funcs)


def create_process(func):
    def do(selector, **kwargs):
        def wrap():
            return func(selector, kwargs)
        return wrap
    return do


@create_process
def process(selector, kwargs):
    from lxml.cssselect import CSSSelector
    # selectors starting with '/' are treated as XPath; anything else is CSS
    xpath = selector if selector.startswith('/') else CSSSelector(selector).path
    return xpath, kwargs
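
The `create_process` decorator is the least obvious part: it turns `process(selector, **kwargs)` into a rule that is only evaluated later, when `scrape()` calls it. The sketch below isolates that pattern without lxml. The body of `process` here is a stand-in that returns the selector unchanged instead of converting CSS to XPath, and the name `rule` is illustrative.

```python
def create_process(func):
    # Wrap func(selector, kwargs) so that calling it only records the
    # arguments and returns a zero-argument callable.
    def do(selector, **kwargs):
        def wrap():
            return func(selector, kwargs)
        return wrap
    return do


@create_process
def process(selector, kwargs):
    # Stand-in body: the real version converts CSS selectors to XPath.
    return selector, kwargs


# Nothing is evaluated here; we only get a deferred rule back.
rule = process('span.title > a', title='TEXT', url='@href')

# The scraper later calls the rule to obtain (selector, attribute map).
selector, attrs = rule()
print(selector)
print(sorted(attrs.items()))
```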