Notes: Google Python Class
From Wikinology
Day 1
Part 1: Introduction and Strings
- Common first line of the script:
#!/usr/bin/python -tt (Better: #!/usr/bin/env python)
- Common start point for stand alone module:
if __name__ == '__main__': main()
- String concatenation: 'Hello' + str(6).
- Comments start with "#".
- Logical operators: "and", "or" and "not".
- Format a string:
'Hi %s, I have %d donuts.' % ('Alice', 42)
- String slice: a[m:n] starts at index m and goes up to, but does not include, index n.
a = 'Hello'
a[1:3] # el
a[1:] # ello
a[:3] # Hel
-  H  e  l  l  o
-  0  1  2  3  4
- -5 -4 -3 -2 -1
- Module 'sys'
- sys.argv -- the list of command-line arguments; the first element is the name of the script itself
- Useful built-in functions:
- len(list), dir(sys) and help(sys)
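The pieces above (the main() entry point, sys.argv, % formatting and slicing) fit together roughly as follows. This is a minimal sketch, written so that it also runs under Python 3 (hence print with parentheses); the function name greet and the fallback name 'Alice' are made up for illustration.

```python
import sys

def greet(name, donuts):
    # '%' formatting: %s inserts any object as a string, %d an integer.
    return 'Hi %s, I have %d donuts.' % (name, donuts)

def main(argv):
    # argv[0] is the script name; real arguments start at index 1.
    args = argv[1:]
    name = args[0] if args else 'Alice'
    print(greet(name, 42))

    word = 'Hello'
    # Negative indices count from the end of the string.
    print(word[1:3])   # el
    print(word[-3:])   # llo
    print(word[:-1])   # Hell

if __name__ == '__main__':
    main(sys.argv)
```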
Part 2: Lists, Sorting, and Tuples
- Assigning a list a to b does not make a copy; both names refer to the same list.
a = [1, 2, 3]
b = a
a[0] = 4
print b[0] # 4
- To create a copy, use "b = a[:]".
- To check if a value is in the list, use the "in" operator:
a = [1, 2, 3]
1 in a # True
4 in a # False
- "append(object)" adds to the end; "pop([index])" removes and returns the item at index (the last item if no index is given). Both change the list itself.
- "del a" removes the name a, but the object itself survives as long as another reference to it exists.
a = [1, 2, 3]
b = a
del a
print b # [1, 2, 3]
- sorted(iterable, cmp=None, key=None, reverse=False) --> new sorted list
- "key" is a function that is called on each item in the list; "sorted" orders the items by the values that function returns.
a = [1, 3, 2]
sorted(a) # [1, 2, 3]
sorted(a, reverse=True) # [3, 2, 1]
print a # [1, 3, 2]
l = ['aaa', 'b', 'cc']
sorted(l, key=len) # ['b', 'cc', 'aaa']
- Join a list into a string: "xxx".join(list); split a string into a list: string.split(sep)
a = ['a', 'b', 'c']
b = ':'.join(a) # 'a:b:c'
b.split(':') # ['a', 'b', 'c']
- When "sorted" works on a list of tuples, it compares the first item in each tuple first; if the first items are equal, it goes on to the second one.
- To sort by several criteria, write a key function that returns a tuple of those criteria, most important first.
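Sorting by several criteria with a tuple-returning key function could look like the sketch below. The sample records and the function name by_score_then_name are invented for illustration; the code is written to also run under Python 3.

```python
# Hypothetical sample data: (name, score) pairs.
records = [('bob', 90), ('alice', 85), ('carol', 90), ('dave', 85)]

def by_score_then_name(record):
    name, score = record
    # Tuples compare item by item, so the score decides first and the
    # name breaks ties. Negating the score puts high scores first.
    return (-score, name)

ranked = sorted(records, key=by_score_then_name)
print(ranked)
# [('bob', 90), ('carol', 90), ('alice', 85), ('dave', 85)]

# sorted() returned a new list; the original order is unchanged.
print(records[0])  # ('bob', 90)
```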
Part 3: Dicts and Files
- d['x'] --> raises KeyError if 'x' is not a key / d.get('x') --> returns None instead
- 'x' in d --> returns True if there is a key named 'x' in the dict
- d.keys() / d.values() # arbitrary order, but the two lists line up with each other
- d.items() returns a list of (key, value) tuples
- open(filename, 'rU') --> the 'U' flag normalizes line endings across different platforms
- print something, --> the trailing comma suppresses the newline that "print" normally adds
- To read lines in one file:
for line in f: # one line at a time
    print line,
lines = f.readlines() # returns a list of lines
text = f.read() # returns the whole file as one string
- Tips: program incrementally; good variable names pay off.
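As a small illustration of dicts and file reading working together, here is a word-count sketch. It is not taken from the class material: the function name count_words and the sample text are assumptions, the file is a temporary one created only for the demo, and plain 'r' mode is used so the code also runs under Python 3.

```python
import os
import tempfile

def count_words(filename):
    counts = {}
    f = open(filename, 'r')
    for line in f:                      # iterate line by line
        for word in line.split():
            word = word.lower()
            # get() returns the default (0) instead of raising KeyError.
            counts[word] = counts.get(word, 0) + 1
    f.close()
    return counts

# Write a small throwaway file so the sketch is self-contained.
fd, path = tempfile.mkstemp(suffix='.txt')
with os.fdopen(fd, 'w') as out:
    out.write('the cat\nThe dog the end\n')

counts = count_words(path)
os.remove(path)

# items() yields (key, value) pairs; sorting them gives a stable order.
for word, n in sorted(counts.items()):
    print(word, n)

print('cat' in counts)       # True
print(counts.get('bird'))    # None
```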
Day 2
Part 1: Regular Expressions
- re.search(pat, text) --> returns a "match" object, or None if the pattern is not found
- if match: match.group() --> check the result first so that group() is never called on None
- Basic patterns:
- . (dot) any char
- \w word char (letter, digit or underscore)
- \d digit
- \s whitespace; \S non-whitespace
- + one or more
- * zero or more
- escape character: \ (e.g. "\." represents a real dot)
- Use [ ] to enclose a set of characters; a dot (.) inside [ ] represents a real dot.
- Use ( ) to mark the parts of the match we really care about (groups); then retrieve each part by its index with group(n).
m = re.search(r'([\w.]+)@([\w.]+)', 'blah john.k@gmail.com yatta @ ')
print m.group() # john.k@gmail.com
print m.group(1) # john.k
print m.group(2) # gmail.com
- re.findall(pat, text) --> returns a list of matches
- optional third argument "flags": re.IGNORECASE, re.DOTALL, etc.
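The difference between findall with and without groups, and the effect of a flag, can be seen in a short sketch. The second address in the sample text is made up for illustration; the code is written to also run under Python 3.

```python
import re

text = 'blah john.k@gmail.com yatta ALICE@Example.COM @ '

# Without groups, findall returns the full matched strings.
emails = re.findall(r'[\w.]+@[\w.]+', text)
print(emails)   # ['john.k@gmail.com', 'ALICE@Example.COM']

# With two groups, each result is a (user, host) tuple.
pairs = re.findall(r'([\w.]+)@([\w.]+)', text)
print(pairs)    # [('john.k', 'gmail.com'), ('ALICE', 'Example.COM')]

# search() returns None when nothing matches, so test before .group().
match = re.search(r'example\.com', text, re.IGNORECASE)
if match:
    print(match.group())   # Example.COM
```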
Part 2: Utilities: OS and Commands
os module
- os.listdir(path)
- os.path.join(...), os.path.abspath(...), os.path.exists(...)
shutil module
- shutil.copy(source, dest)
commands module
- commands.getstatusoutput(cmd) --> returns (status, output)
cmd = 'ls -l ' + dir
(status, output) = commands.getstatusoutput(cmd)
if status:
    sys.stderr.write('There was an error: ' + output)
    sys.exit(1)
print output
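A sketch that combines the os and shutil calls above in a throwaway directory. The "commands" module exists only in Python 2; under Python 3 the same function is available as subprocess.getstatusoutput, which is what this sketch uses. The file names and the echo command are arbitrary choices for the demo.

```python
import os
import shutil
import subprocess
import tempfile

# Work in a throwaway directory so nothing real is touched.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, 'a.txt')
with open(src, 'w') as f:
    f.write('hello\n')

# Copy the file, then list the directory with full paths.
dst = os.path.join(workdir, 'b.txt')
shutil.copy(src, dst)
names = sorted(os.listdir(workdir))
paths = [os.path.join(workdir, name) for name in names]
print(names)                      # ['a.txt', 'b.txt']
print(os.path.exists(paths[1]))   # True

# Run an external command; a non-zero status signals failure.
status, output = subprocess.getstatusoutput('echo hello')
if status:
    print('There was an error: ' + output)
else:
    print(output)

shutil.rmtree(workdir)
```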
Part 3: Utilities: URLs and HTTP, Exceptions
- Exception handler:
try:
    f = open(filename)
    text = f.read()
    print text
except IOError:
    print 'IO Error:', filename
- Reuse your own modules: code in trythis.py becomes available with "import trythis".
- Triple quotes allow a string to span multiple lines.
"""
This is a comment
in multiple lines.
"""
urllib module
uf = urllib.urlopen('http://www.google.com')
uf.read()
urllib.urlretrieve('http://***/foo.gif', 'bar.gif')
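The exception handler from this part can be wrapped in a small helper so that a missing file does not stop the program. A minimal sketch, written to also run under Python 3; the helper name read_text and the file name are hypothetical.

```python
import os
import tempfile

def read_text(filename):
    """Return the file's contents, or None if it cannot be read."""
    try:
        with open(filename) as f:
            return f.read()
    except IOError:
        # Reached when the file is missing or unreadable.
        print('IO Error:', filename)
        return None

# A path that is assumed not to exist, to trigger the handler.
missing = os.path.join(tempfile.gettempdir(), 'no_such_file_for_this_sketch.txt')
result = read_text(missing)
print(result)
```

Keeping the try block small makes it clear which call the handler is guarding.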
Part 4: Closing Thoughts
- List comprehension syntax -- combines a for loop with an optional if filter in one expression:
a = ['aa', 'b', 'ccc']
[ len(s) for s in a ] # [2, 1, 3]
b = [1, 2, 3, 4]
[ num * num for num in b if num > 2 ] # [9, 16]
- With great power comes great responsibility.
