Feb 13, 23:52 - Karl (line 1): Syntax error, don't use ".py" in the import
Feb 13, 23:53 - Karl (line 3): Syntax error, not sure if you can put that on the same line like that
Feb 13, 23:53 - Karl (line 5): Syntax error, need the "speling." prefix
Feb 13, 23:53 - Karl (line 16): Syntax error, need the "speling." prefix. Same on line 20
Feb 13, 23:54 - Karl (line 23): I think you tried to say something like "count how many times each word occurs in misspelled" but that doesn't work (and would be an asymptotically inefficient algorithm)
Feb 13, 23:56 - Karl (line 19): If you use a list here, you should use append() instead of inserting at index 0. Inserting at index 0 in what is basically an array is O(N) and makes the entire loop O(N^2), while appending is amortized O(1)
Feb 13, 23:57 - Karl (line 19): Using a dictionary instead of a list here would simplify lines 22-30, since it automatically gives you counting and sorting
Feb 13, 23:59 - Karl (line 23): It's a bad idea to reuse variables with different meanings. If you are trying to save memory (which doesn't really matter here) I think you can help Python garbage collect by setting old variables to None.
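The first comment on line 19 can be illustrated with a minimal sketch. It is written in Python 3, while the listing under review is Python 2, and the names `collect_front` and `collect_back` are made up for this example; neither appears in the reviewed file.

```python
def collect_front(words):
    """Build a list by inserting at index 0, as line 19 does."""
    result = []
    for word in words:
        # Each insert shifts every existing element one slot to the right,
        # so a single call is O(N) and the whole loop is O(N^2).
        result.insert(0, word)
    return result


def collect_back(words):
    """Build a list with append(), which is amortized O(1) per call."""
    result = []
    for word in words:
        result.append(word)
    return result


words = ["teh", "recieve", "teh"]
# Both versions hold the same words, only in opposite order.
assert collect_front(words) == list(reversed(collect_back(words)))
```

Since the listing sorts the list on line 22 anyway, the order in which the words are collected should not matter there.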
 1 | import speling.py
 2 |
 3 | def misspell(): """returns a list of words which appear in Bush's speech but are not in our dictionary"""
 4 |     import re
 5 |     set_dictionary('dict.txt')
 6 |     input = open('input.txt')
 7 |     input = input.read() #input is now a string of words
 8 |     pat = re.compile('[\n(--).,?";!:/$1234567890]')
 9 |     input = pat.sub(' ', input) #input is now stripped of punctuation marks
10 |     pat = re.compile('[.]') #for words like U.S.
11 |     input = pat.sub('', input)
12 |     pat = re.compile('[1234567890]*\\w*]') #for words like 20th
13 |     input = pat.sub('', input)
14 |     input = input.split() #input is now a sequence/list of words
15 |     misspelled = [] #misspelled is the list of misspelled words
16 |     index = spellcheck_text(input, index) #index is the index of the next misspelled word
17 |
18 |     while index !=-1: #as long as there is one more misspelled word
19 |         misspelled.insert(0, input[index]) #add the misspelled word into the misspelled words list
20 |         index = spellcheck_text(input, index + 1) #obtain the index of the next misspelled word
21 |
22 |     misspelled.sort()
23 |     misspelled = count(misspelled) #misspelled is a dictionary of keys (words) and values (occurences)
24 |     sortedmisspelled = []
25 |     for key in misspelled.keys():
26 |         sortedmisspelled.append(key + ' (' + str(misspelled[key]) + ')' ) #formats the list of words and values
27 |     sortedmisspelled.sort() #sorts the lists of words and values
28 |     for element in sortedmisspelled: #prints each word and its value on its own line
29 |         print element
30 |     print '\n'
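One possible shape for lines 15-30 with the review comments applied is sketched below, in Python 3 rather than the Python 2 of the listing. The `speling` module is not shown on this page, so `spellcheck_text` here is a hypothetical stand-in whose behaviour is only guessed from how the listing calls it, and `KNOWN_WORDS`, `count_misspelled` and `format_report` are names invented for the example. A plain dict does not keep its keys in alphabetical order, so the sketch sorts explicitly.

```python
# Hypothetical stand-in for the word list loaded from dict.txt.
KNOWN_WORDS = {"the", "people", "of", "america"}


def spellcheck_text(words, start):
    """Stand-in for speling.spellcheck_text (assumed behaviour): return the
    index of the first unknown word at or after start, or -1 if none."""
    for i in range(start, len(words)):
        if words[i].lower() not in KNOWN_WORDS:
            return i
    return -1


def count_misspelled(words):
    """Count each misspelled word in a dictionary instead of a list."""
    counts = {}
    index = spellcheck_text(words, 0)  # start from 0, not an undefined index
    while index != -1:
        word = words[index]
        counts[word] = counts.get(word, 0) + 1
        index = spellcheck_text(words, index + 1)
    return counts


def format_report(counts):
    """Return 'word (n)' lines sorted by word, under a new name so that
    no variable is reused with a different meaning."""
    return ["%s (%d)" % (word, n) for word, n in sorted(counts.items())]


words = "the peeple of Amerika the peeple".split()
for line in format_report(count_misspelled(words)):
    print(line)
# prints:
# Amerika (1)
# peeple (2)
```

Counting while scanning makes the separate sort on line 22 and the `count()` call on line 23 unnecessary, and each name keeps a single meaning from start to finish.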