Python Note
I. Main Data Structures
1. list
A list is an ordered, mutable sequence whose length can change freely. It is written with square brackets [].
lista = [1,2,3,4]
listb = ['a','b','c']
listc = [1,'a',4]
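A short sketch of how a list grows and changes in place. The values are illustrative only.

```python
# Illustrative values; lista reuses the name from the notes above.
lista = [1, 2, 3, 4]

lista.append(5)        # length grows from 4 to 5
lista[0] = 10          # elements can be reassigned in place
first_two = lista[:2]  # slicing returns a new list
del lista[-1]          # length shrinks back to 4

print(len(lista))   # 4
print(lista)        # [10, 2, 3, 4]
print(first_two)    # [10, 2]
```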
2. tuple
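A tuple is an immutable sequence: once created, its length cannot change and its elements cannot be reassigned (a mutable object stored inside a tuple, such as a list, can still be modified in place). It is written with parentheses ().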
tuplea = (1,2,3)
tupleb = ('a','b','c')
tuplec = (1,'b',3)
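A minimal sketch of what immutability means for a tuple. The names point and record are hypothetical, chosen only for this illustration.

```python
# Hypothetical names for illustration only.
point = (1, 2, 3)

try:
    point[0] = 10          # reassigning an element raises TypeError
    reassigned = True
except TypeError:
    reassigned = False

# A tuple may hold a mutable object; that object can still change in place.
record = (1, ['a', 'b'])
record[1].append('c')

print(reassigned)   # False
print(record)       # (1, ['a', 'b', 'c'])
```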
3. dict
A dict (dictionary) is a hash map from keys to values. It is written with curly braces {}.
dictc = {}
dictc[1] = 'a'
dictc['b'] = 2
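A brief sketch of common dict operations. The name ages and its contents are made-up example data.

```python
# Made-up example data.
ages = {}
ages['alice'] = 30
ages['bob'] = 25
ages['alice'] = 31               # assigning to an existing key overwrites it

has_bob = 'bob' in ages          # membership test checks keys
missing = ages.get('carol')      # get() returns None instead of raising KeyError
fallback = ages.get('carol', 0)  # or a default value if one is given

for name in sorted(ages):        # iterating a dict yields its keys
    print(name, ages[name])
```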
II. Common String Methods
- s.lower(), s.upper() -- returns the lowercase or uppercase version of the string
- s.strip() -- returns a string with whitespace removed from the start and end
- s.isalpha()/s.isdigit()/s.isspace()... -- tests if all the string chars are in the various character classes
- s.startswith('other'), s.endswith('other') -- tests if the string starts or ends with the given other string
- s.find('other') -- searches for the given other string (not a regular expression) within s, and returns the first index where it begins or -1 if not found
- s.replace('old', 'new') -- returns a string where all occurrences of 'old' have been replaced by 'new'
- s.split('delim') -- returns a list of substrings separated by the given delimiter. The delimiter is not a regular expression, it's just text. 'aaa,bbb,ccc'.split(',') -> ['aaa', 'bbb', 'ccc']. As a convenient special case s.split() (with no arguments) splits on all whitespace chars.
- s.join(list) -- opposite of split(), joins the elements in the given list together using the string as the delimiter. e.g. '---'.join(['aaa', 'bbb', 'ccc']) -> 'aaa---bbb---ccc'
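The string methods above can be sketched together as follows. The sample text is arbitrary.

```python
line = '  Hello, World  '             # arbitrary sample text

cleaned = line.strip()                # 'Hello, World'
lowered = cleaned.lower()             # 'hello, world'
starts = cleaned.startswith('Hello')  # True
pos = cleaned.find('World')           # 7
not_found = cleaned.find('xyz')       # -1
replaced = cleaned.replace('World', 'Python')

parts = 'aaa,bbb,ccc'.split(',')      # ['aaa', 'bbb', 'ccc']
words = ' a  b\tc\n'.split()          # no argument: split on runs of whitespace
joined = '---'.join(parts)            # 'aaa---bbb---ccc'

print(cleaned, lowered, pos, not_found)
print(replaced, parts, words, joined)
```

All of these return new strings or lists; the original string is never modified.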
III. Common List Methods
- list.append(elem) -- adds a single element to the end of the list. Common error: does not return the new list, just modifies the original.
- list.insert(index, elem) -- inserts the element at the given index, shifting elements to the right.
- list.extend(list2) -- adds the elements in list2 to the end of the list. Using + or += on a list is similar to using extend().
- list.index(elem) -- searches for the given element from the start of the list and returns its index. Throws a ValueError if the element does not appear (use "in" to check without a ValueError).
- list.remove(elem) -- searches for the first instance of the given element and removes it (throws ValueError if not present)
- list.sort() -- sorts the list in place (does not return it). (The built-in sorted() function, which returns a new sorted list and leaves the original unchanged, is often preferred.)
- list.reverse() -- reverses the list in place (does not return it)
- list.pop(index) -- removes and returns the element at the given index. Returns the rightmost element if index is omitted (roughly the opposite of append()).
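A small walk-through of the list methods above, with the list contents shown after each step. The numbers are arbitrary.

```python
nums = [3, 1, 2]              # arbitrary starting values

result = nums.append(4)       # nums is [3, 1, 2, 4]; result is None
nums.insert(0, 9)             # [9, 3, 1, 2, 4]
nums.extend([5, 6])           # [9, 3, 1, 2, 4, 5, 6]
idx = nums.index(1)           # 2
nums.remove(9)                # [3, 1, 2, 4, 5, 6]
last = nums.pop()             # returns 6; nums is [3, 1, 2, 4, 5]
first = nums.pop(0)           # returns 3; nums is [1, 2, 4, 5]
nums.reverse()                # [5, 4, 2, 1]
ordered = sorted(nums)        # new list [1, 2, 4, 5]; nums is unchanged
snapshot = list(nums)         # copy taken before sorting in place
nums.sort()                   # nums is now [1, 2, 4, 5]

print(result, idx, last, first)
print(snapshot, ordered, nums)
```

Note that append(), sort() and reverse() all return None, so writing nums = nums.append(4) would discard the list.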
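IV. File Operations
- open(path, mode) -- opens the file at the given path (directory plus filename) in the given mode
Modes: r read-only; r+ read and write; w write (creates the file, overwriting any existing one); a append; b binary (added to another mode, e.g. rb). In Python 2, rU or U opens the file for reading with universal newline support (PEP 278); Python 3 text mode does this by default.
Line endings: Windows uses '\r\n', Unix uses '\n', and classic Mac OS used '\r'.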
- filenames = os.listdir(dir) -- list of filenames in that directory path (not including . and ..). The filenames are just the names in the directory, not their absolute paths.
- os.path.join(dir, filename) -- given a filename from the above list, use this to put the dir and filename together to make a path
- os.path.abspath(path) -- given a path, return an absolute form, e.g. /home/nick/foo/bar.html
- os.path.dirname(path), os.path.basename(path) -- given dir/foo/bar.html, return the dirname "dir/foo" and basename "bar.html"
- os.path.exists(path) -- true if it exists
- os.mkdir(dir_path) -- makes one dir, os.makedirs(dir_path) makes all the needed dirs in this path
- shutil.copy(source-path, dest-path) -- copy a file (dest path directories should exist)
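The file and directory calls above can be combined in one sketch. It works in a temporary scratch directory; the file names (notes.txt, notes_copy.txt) and the a/b subdirectories are hypothetical.

```python
import os
import shutil
import tempfile

base = tempfile.mkdtemp()                  # scratch directory for the example
nested = os.path.join(base, 'a', 'b')
os.makedirs(nested)                        # creates both 'a' and 'a/b'

src = os.path.join(base, 'notes.txt')      # hypothetical file name
with open(src, 'w') as f:                  # 'w' creates or overwrites
    f.write('first line\n')
with open(src, 'a') as f:                  # 'a' appends to the end
    f.write('second line\n')
with open(src, 'r') as f:                  # 'r' reads as text
    lines = f.read().splitlines()

dest = os.path.join(nested, 'notes_copy.txt')
shutil.copy(src, dest)                     # destination directory already exists
copied = os.path.exists(dest)

names = sorted(os.listdir(base))           # bare names only: ['a', 'notes.txt']
paths = [os.path.abspath(os.path.join(base, n)) for n in names]

print(lines)
print(names)
print(os.path.dirname(dest) == nested, os.path.basename(dest), copied)

shutil.rmtree(base)                        # clean up the scratch directory
```

The with statement closes each file automatically, even if an error occurs inside the block.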
V. HTTP Operations
- ufile = urllib.urlopen(url) -- returns a file-like object for that url (Python 2 API; in Python 3 the call is urllib.request.urlopen)
- text = ufile.read() -- can read from it, like a file (readlines() etc. also work)
- info = ufile.info() -- the meta info for that request. info.gettype() is the mime type, e.g. 'text/html'
- baseurl = ufile.geturl() -- gets the "base" url for the request, which may be different from the original because of redirects
- urllib.urlretrieve(url, filename) -- downloads the url data to the given file path
- urlparse.urljoin(baseurl, url) -- given a url that may or may not be full, and the baseurl of the page it comes from, return a full url. Use geturl() above to provide the base url.
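A hedged sketch of the same operations, assuming Python 3, where these functions live in urllib.request and urllib.parse and the mime type is read with get_content_type(). The helper fetch() and the example.com URLs are hypothetical. fetch() needs network access, so it is defined but not called here; only urljoin() is exercised.

```python
from urllib.parse import urljoin
from urllib.request import urlopen


def fetch(url):
    """Hypothetical helper: return (final_url, mime_type, body_bytes).

    Requires network access. geturl() reflects any redirects.
    """
    with urlopen(url) as ufile:
        final_url = ufile.geturl()
        mime_type = ufile.info().get_content_type()
        body = ufile.read()
    return final_url, mime_type, body


# urljoin() resolves a possibly relative link against the page's base url.
base = 'http://example.com/docs/index.html'
relative = urljoin(base, 'images/logo.png')
rooted = urljoin(base, '/about.html')
absolute = urljoin(base, 'http://other.example.org/x')

print(relative)   # http://example.com/docs/images/logo.png
print(rooted)     # http://example.com/about.html
print(absolute)   # http://other.example.org/x
```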