Key-value pairs « 我的天

2020-04

So This Is What You Can Do with a Dictionary!

By xrspook @ 18:30:55, filed under: 扮IT

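At first the script I wrote on my own did run, but it was so slow that I began to question everything. I went off to eat, and after half an hour of grinding it had only got as far as the letter b, so that approach was clearly a failure. What I had done was build a dictionary from the word list in the usual way and then iterate over every word in it. Each word was passed to a function that compared it with every other word in the dictionary: both words (that is, strings) were broken into lists of characters and sorted, and if the sorted lists matched and the word being compared against was smaller than the word being checked, the two belonged together and the match was appended to that word's list. Any list longer than 2 was returned and printed. This does pick out the anagrams, but it is painfully slow!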
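To make the cost of that approach concrete, here is a minimal sketch of a pairwise search. It is a reconstruction from the description above, not the original script: the function name `anagram_sets_pairwise`, the `seen` set, and the sample words are all assumptions made for illustration.

```python
def anagram_sets_pairwise(words):
    """Group anagrams by comparing every word against every other word."""
    groups = []
    seen = set()  # words already placed in a group (hypothetical bookkeeping)
    for word in words:
        if word in seen:
            continue
        group = [word]
        for other in words:
            # Both words are re-sorted on every single comparison.
            if other != word and other not in seen and sorted(other) == sorted(word):
                group.append(other)
        seen.update(group)
        if len(group) > 1:
            groups.append(group)
    return groups


sample = ["deltas", "desalt", "lasted", "apple", "retainers", "ternaries"]
for group in anagram_sets_pairwise(sample):
    print(group)
```

For a list of n words the inner loop runs roughly n times for each of the n outer iterations, so the work grows with n squared, and every comparison sorts two words again. That is fine for six words and hopeless for a full word list.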

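When I saw the reference solution I jumped out of my seat. It uses the one-liner ''.join(lists), which glues a list of strings back into a single string. Good grief! The solution breaks each word into a list of characters, sorts it, and joins it back together, and the crucial part is that this canonical sorted string is used as the key while the dictionary is being built. Every word made of the same letters is filed as a follower in the value stored under that key. The dictionary is still a dictionary, but its keys are now canonical letter strings and its values are the lists of words from the word list that rearrange those letters. That never occurred to me. How could anyone think of that!!!!!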
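The trick described above can be sketched in a few lines. This is an illustration under assumptions, not the reference solution itself: the names `signature` and `group_by_signature` and the sample words are chosen here for the example.

```python
def signature(word):
    """Return the word's letters in sorted order, joined back into one string."""
    return ''.join(sorted(word))


def group_by_signature(words):
    """Map each sorted-letter string to the list of words that share it."""
    groups = {}
    for word in words:
        # setdefault creates the empty list the first time a signature appears.
        groups.setdefault(signature(word), []).append(word)
    return groups


sample = ["deltas", "desalt", "lasted", "apple", "retainers", "ternaries"]
groups = group_by_signature(sample)
print(signature("lasted"))  # adelst
print(groups["adelst"])     # ['deltas', 'desalt', 'lasted']
```

Each word is sorted exactly once and then dropped into its bucket, so a single pass over the word list replaces the word-against-word comparison.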

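The exercise asks for the output in descending order and then asks which 8 letters form the most anagrams. In practice, though, the reference solution's output does not answer what was asked: it is not in descending order, and it merely lists the anagram sets of 8-letter words without saying which set is the largest.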
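Both requirements can be answered directly once the dictionary exists. The sketch below assumes a dictionary that maps sorted-letter strings to word lists; the function names `anagram_lists_longest_first` and `best_bingo` are hypothetical, and the small `groups` dictionary is hand-built from words that appear in this post.

```python
def anagram_lists_longest_first(groups):
    """Return the anagram lists (two or more words), longest list first."""
    lists = [words for words in groups.values() if len(words) > 1]
    return sorted(lists, key=len, reverse=True)


def best_bingo(groups, num_letters=8):
    """Return the (letters, words) pair with the most words for the given length."""
    candidates = [(key, words) for key, words in groups.items()
                  if len(key) == num_letters and len(words) > 1]
    if not candidates:
        return None
    # max() keeps the first candidate it meets when several are tied.
    return max(candidates, key=lambda pair: len(pair[1]))


groups = {
    "aeginrst": ["angriest", "astringe", "ganister", "gantries",
                 "granites", "ingrates", "rangiest"],
    "eelmrsst": ["resmelts", "smelters", "termless"],
    "adelst": ["deltas", "desalt", "lasted", "salted", "slated", "staled"],
    "aah": ["aah", "aha"],
    "aelpp": ["apple"],
}

for words in anagram_lists_longest_first(groups):
    print(len(words), words)

letters, words = best_bingo(groups)
print(letters, len(words))  # aeginrst 7
```

Sorting with `key=len` orders by list size alone, and `max` names the winning letter set explicitly instead of leaving the reader to scan the output.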

Exercise 2: More anagrams! Write a program that reads a word list from a file (see Section 9.1) and prints all the sets of words that are anagrams. Here is an example of what the output might look like:
['deltas', 'desalt', 'lasted', 'salted', 'slated', 'staled']
['retainers', 'ternaries']
['generating', 'greatening']
['resmelts', 'smelters', 'termless']
Hint: you might want to build a dictionary that maps from a collection of letters to a list of words that can be spelled with those letters. The question is, how can you represent the collection of letters in a way that can be used as a key?

Modify the previous program so that it prints the longest list of anagrams first, followed by the second longest, and so on.

In Scrabble a "bingo" is when you play all seven tiles in your rack, along with a letter on the board, to form an eight-letter word. What collection of 8 letters forms the most possible bingos?

Solution: http://thinkpython2.com/code/anagram_sets.py.
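The hint's question about representing a collection of letters as a key comes down to hashability. A short sketch, with the variable names `letters` and `index` chosen for illustration: a sorted list of letters cannot be a dictionary key, while a joined string or a tuple can.

```python
word = "salted"
letters = sorted(word)  # a list of single characters in alphabetical order

index = {}
try:
    index[letters] = [word]  # lists are mutable, so they are not hashable
except TypeError as err:
    print("list key rejected:", err)

index[''.join(letters)] = [word]  # a string key works
index[tuple(letters)] = [word]    # so does a tuple of characters

print(sorted(index, key=str))
```

Either form works as a key; the string is simply easier to read when printed.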

from time import time


def sorted_anagram(d):
    l = []
    for key in d:
        if len(d[key]) > 1:
            l.append((len(d[key]), d[key]))  # a tuple of (word count, list of words)
    return sorted(l, reverse=True)  # getting descending order took some fiddling


def eight_letters(d, num):
    global length  # resorting to a global variable just to record the maximum
    new_l = []
    for key in d:
        if len(key) == num and len(d[key]) > 1:
            new_l.append((len(d[key]), d[key]))
            if len(d[key]) >= length:
                length = len(d[key])
    return sorted(new_l)


def sorted_letters(word):
    list_word = sorted(list(word))  # break the string into a list of characters, then sort it
    reword = ''.join(list_word)  # glue the character list back into a string
    return reword


def set_dict(fin):
    d = {}
    for line in fin:
        word = line.strip()
        reword = sorted_letters(word)  # sorting the letters is the key step and must happen while the dictionary is built!
        if reword not in d:
            d[reword] = [word]  # the key is no longer a word, just the canonical sorted string
        else:
            d[reword].append(word)  # the values hold the actual words from the word list
    return d


length = 0
count = 0
with open('words.txt') as fin:  # the file is closed once the dictionary is built
    start = time()
    d = set_dict(fin)
for item in sorted_anagram(d):
    print(item)
    count += 1
print(count)
for item in eight_letters(d, 8):
    if item[0] == length:
        print(item)
end = time()
print(end - start)
# ......
# (2, ['abacas', 'casaba'])
# (2, ['aba', 'baa'])
# (2, ['aals', 'alas'])
# (2, ['aal', 'ala'])
# (2, ['aahed', 'ahead'])
# (2, ['aah', 'aha'])
# 10157  # total number of anagram sets
# (7, ['angriest', 'astringe', 'ganister', 'gantries', 'granites', 'ingrates', 'rangiest'])
# the 8-letter set with the most anagrams (7 words in total)
# 0.6079998016357422

Tags: Think Python, exercises, lists, join, dictionaries, strings, 扮IT, scripts, key-value pairs
