@Frankchen 2016-03-28T01:28:49.000000Z

Python Dict and File

python


Dict Hash Table

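Python's hash table structure is called a dict (dictionary). A dict is a collection of key:value pairs enclosed in curly braces. Strings, numbers, and tuples can all be used as keys, and a value can be of any type. To check whether a key is in the dict, use in, or call dict.get(key), which returns None instead of raising KeyError when the key is missing.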

    ## Build up a dict by starting with the empty dict {}
    ## and storing key/value pairs into it like this:
    ## dict[key] = value-for-that-key
    dict = {}
    dict['a'] = 'alpha'
    dict['g'] = 'gamma'
    dict['o'] = 'omega'
    print dict  ## {'a': 'alpha', 'o': 'omega', 'g': 'gamma'}
    print dict['a']  ## Simple lookup, returns 'alpha'
    dict['a'] = 6  ## Overwrite the value stored under 'a'
    'a' in dict  ## True
    ## print dict['z']  ## Throws KeyError
    if 'z' in dict: print dict['z']  ## Avoid KeyError
    print dict.get('z')  ## None (instead of KeyError)
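The lookup options above can be compared side by side. What follows is only a minimal sketch, and it is written for Python 3 (print is a function there, unlike in the Python 2 snippets of this post). The variable name greek and the default string 'unknown' are illustrative choices, not part of the original notes.

```python
# Python 3 sketch. `greek` is an illustrative name that avoids shadowing the built-in dict.
greek = {'a': 'alpha', 'g': 'gamma', 'o': 'omega'}

has_a = 'a' in greek                  # membership test looks at the keys
missing = greek.get('z')              # returns None, no KeyError
fallback = greek.get('z', 'unknown')  # the optional second argument is the default

try:
    greek['z']                        # plain indexing raises KeyError for a missing key
    raised = False
except KeyError:
    raised = True

print(has_a, missing, fallback, raised)
```

Using get() with a default is handy when a missing key is an expected case rather than an error.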

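A for loop over a dict iterates over all of its keys, and the keys come out in arbitrary order. dict.keys() and dict.values() return lists of all the keys or all the values. There is also items(), which returns a list of (key, value) tuples; this is the most efficient way to examine all the key/value data in the dict. Any of these lists can be passed to the sorted() function.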

    ## By default, iterating over a dict iterates over its keys.
    ## Note that the keys are in an arbitrary order.
    for key in dict: print key
    ## prints a g o
    ## Exactly the same as above
    for key in dict.keys(): print key
    ## Get the .keys() list:
    print dict.keys()  ## ['a', 'o', 'g']
    ## Likewise, there's a .values() list of values
    print dict.values()  ## ['alpha', 'omega', 'gamma']
    ## Common case -- loop over the keys in sorted order,
    ## accessing each key/value
    for key in sorted(dict.keys()):
        print key, dict[key]
    ## .items() is the dict expressed as (key, value) tuples
    print dict.items()  ## [('a', 'alpha'), ('o', 'omega'), ('g', 'gamma')]
    ## This loop syntax accesses the whole dict by looping
    ## over the .items() tuple list, accessing one (key, value)
    ## pair on each iteration.
    for k, v in dict.items(): print k, '>', v
    ## prints, one pair per line: a > alpha   o > omega   g > gamma
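Sorting and items() combine naturally. The sketch below is written for Python 3 (an assumption, since the snippets in this post are Python 2) and uses the illustrative names greek and lines to collect the pairs in key order.

```python
# Python 3 sketch. `greek` and `lines` are illustrative names.
greek = {'a': 'alpha', 'o': 'omega', 'g': 'gamma'}

lines = []
for key, value in sorted(greek.items()):   # tuples sort by their first element, the key
    lines.append('%s > %s' % (key, value))

print('\n'.join(lines))
```

Because sorted() returns a new list, the dict itself is left unchanged.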

The variants iterkeys(), itervalues(), and iteritems() avoid building the whole list, which is commonly useful when the amount of data is large.
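The iter* methods exist only in Python 2. In Python 3, keys(), values(), and items() already return lightweight views rather than lists, so the same idea can be sketched as follows. This is a Python 3 illustration only, and the names squares and total are hypothetical.

```python
# Python 3 sketch. `squares` and `total` are illustrative names.
squares = {n: n * n for n in range(1000)}

total = 0
for n, sq in squares.items():   # a view object: no list copy of the pairs is built
    total += sq

is_list = isinstance(squares.items(), list)   # False in Python 3

print(total, is_list)
```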

Dict Formatting

The % operator conveniently substitutes values from a dict into a string by key name:

    hash = {}
    hash['word'] = 'garfield'
    hash['count'] = 42
    s = 'I want %(count)d copies of %(word)s' % hash  # %d for int, %s for string
    # 'I want 42 copies of garfield'
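The same substitution also works unchanged in Python 3. The sketch below repeats it with the illustrative name info (to avoid shadowing the built-in hash) and, as an alternative that the original notes do not cover, shows str.format() with the dict unpacked into keyword arguments.

```python
# Python 3 sketch. `info` is an illustrative name.
info = {'word': 'garfield', 'count': 42}

# %-style: names in parentheses are looked up as keys of the dict
s = 'I want %(count)d copies of %(word)s' % info

# Alternative not in the original notes: str.format with keyword unpacking
t = 'I want {count:d} copies of {word}'.format(**info)

print(s)
print(t)
```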

Del

The del operator removes things: a variable definition, list elements or slices, or dict entries. For example:

    var = 6
    del var  # var no more!
    list = ['a', 'b', 'c', 'd']
    del list[0]  ## Delete first element
    del list[-2:]  ## Delete last two elements
    print list  ## ['b']
    dict = {'a':1, 'b':2, 'c':3}
    del dict['b']  ## Delete 'b' entry
    print dict  ## {'a':1, 'c':3}
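One detail worth noting is that del on a dict raises KeyError when the key is absent, just like a plain lookup. The following Python 3 sketch illustrates this. The name counts is hypothetical, and pop() with a default is shown as an alternative that the original notes do not mention.

```python
# Python 3 sketch. `counts` is an illustrative name.
counts = {'a': 1, 'b': 2, 'c': 3}

del counts['b']                    # removes the 'b' entry

try:
    del counts['z']                # missing key: raises KeyError
    raised = False
except KeyError:
    raised = True

removed = counts.pop('z', None)    # alternative: returns the default instead of raising

print(counts, raised, removed)
```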

Files

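The open() function opens a file and returns a file handle, which can then be used for reading or writing. f = open('name', 'r') opens the file for reading and assigns the handle to the variable f; call f.close() when finished. Use mode 'w' for writing and 'a' for appending. The special mode 'rU' converts the different line-ending conventions to a plain '\n'. A for loop is an efficient way to iterate over the lines of a file, but note that this works only for text files, not for binary files.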

    # Echo the contents of a file
    f = open('foo.txt', 'rU')
    for line in f:  ## iterates over the lines of the file
        print line,  ## trailing , so print does not add an end-of-line char
                     ## since 'line' already includes the end-of-line.
    f.close()
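The echo loop can also be written with a with block, which closes the file automatically. This is a minimal Python 3 sketch rather than the original Python 2 code. To stay self-contained it first creates a throwaway file with tempfile, and the file contents and the name seen are invented for illustration.

```python
# Python 3 sketch. The temporary file and its contents are illustrative.
import os
import tempfile

# Create a small throwaway file so the example does not depend on foo.txt existing.
fd, path = tempfile.mkstemp(suffix='.txt')
with os.fdopen(fd, 'w') as f:
    f.write('first line\nsecond line\n')

seen = []
with open(path, 'r') as f:        # the file is closed when the block exits
    for line in f:                # reads one line at a time
        seen.append(line)
        print(line, end='')       # end='' because line already ends with '\n'

os.remove(path)
```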

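Reading one line at a time avoids using too much memory. The f.readlines() method reads the whole file into memory and returns a list with one string per line, while the f.read() method reads the whole file into a single string.
For writing, the f.write() method is the simplest way to send data to an open output file. Alternatively, print >> f, string directs print's output to the file f instead of the screen.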
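The write and read-all calls can be tried together. The sketch below assumes Python 3, where the Python 2 form print >> f, string becomes print(string, file=f). The temporary file and the names as_list and as_string are illustrative.

```python
# Python 3 sketch. The temporary file and variable names are illustrative.
import os
import tempfile

fd, path = tempfile.mkstemp(suffix='.txt')
os.close(fd)

with open(path, 'w') as f:
    f.write('alpha\n')            # write() does not add a newline for you
    print('beta', file=f)         # print adds the trailing newline

with open(path) as f:
    as_list = f.readlines()       # whole file as a list of lines

with open(path) as f:
    as_string = f.read()          # whole file as one string

os.remove(path)
print(as_list)
print(repr(as_string))
```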

Files Unicode

The codecs module provides support for reading Unicode files.

    import codecs
    f = codecs.open('foo.txt', 'rU', 'utf-8')
    for line in f:
        # here line is a *unicode* string
        pass  # placeholder body: process the line here
    f.close()
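In Python 3 the built-in open() accepts an encoding argument directly, so the same idea can be sketched without codecs. This is an illustration under that assumption, not the original code. The temporary file, its contents, and the names decoded and raw are invented for the example.

```python
# Python 3 sketch. The temporary file and its contents are illustrative.
import os
import tempfile

fd, path = tempfile.mkstemp(suffix='.txt')
os.close(fd)

with open(path, 'w', encoding='utf-8') as f:
    f.write('caf\u00e9\n')                    # 'café' plus a newline

with open(path, 'r', encoding='utf-8') as f:
    decoded = [line for line in f]            # each line is a decoded str

with open(path, 'rb') as f:
    raw = f.read()                            # the undecoded bytes on disk

os.remove(path)
print(decoded, raw)
```

Comparing decoded with raw shows that the single character é is stored as two bytes in UTF-8.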