[LeetCode] Design Search Autocomplete System 设计搜索自动补全系统

Design a search autocomplete system for a search engine. Users may input a sentence (at least one word and end with a special character '#'). For each character they type except '#', you need to return the top 3historical hot sentences that have prefix the same as the part of sentence already typed. Here are the specific rules:

  1. The hot degree for a sentence is defined as the number of times a user typed the exactly same sentence before.
  2. The returned top 3 hot sentences should be sorted by hot degree (The first is the hottest one). If several sentences have the same degree of hot, you need to use ASCII-code order (smaller one appears first).
  3. If less than 3 hot sentences exist, then just return as many as you can.
  4. When the input is a special character, it means the sentence ends, and in this case, you need to return an empty list.

Your job is to implement the following functions:

The constructor function:

AutocompleteSystem(String[] sentences, int[] times): This is the constructor. The input is historical data. Sentences is a string array consists of previously typed sentences. Times is the corresponding times a sentence has been typed. Your system should record these historical data.

Now, the user wants to input a new sentence. The following function will provide the next character the user types:

List<String> input(char c): The input c is the next character typed by the user. The character will only be lower-case letters ('a' to 'z'), blank space (' ') or a special character ('#'). Also, the previously typed sentence should be recorded in your system. The output will be the top 3 historical hot sentences that have prefix the same as the part of sentence already typed.

Example:

Operation: AutocompleteSystem(["i love you", "island","ironman", "i love leetcode"], [5,3,2,2]) 
The system have already tracked down the following sentences and their corresponding times: 
"i love you" : 5 times 
"island" : 3 times 
"ironman" : 2 times 
"i love leetcode" : 2 times 
Now, the user begins another search: 
Operation: input('i') 
Output: ["i love you", "island","i love leetcode"] 
Explanation: 
There are four sentences that have prefix "i". Among them, "ironman" and "i love leetcode" have same hot degree. Since ' ' has ASCII code 32 and 'r' has ASCII code 114, "i love leetcode" should be in front of "ironman". Also we only need to output top 3 hot sentences, so "ironman" will be ignored. 
Operation: input(' ') 
Output: ["i love you","i love leetcode"] 
Explanation: 
There are only two sentences that have prefix "i ". 
Operation: input('a') 
Output: [] 
Explanation: 
There are no sentences that have prefix "i a". 
Operation: input('#') 
Output: [] 
Explanation: 
The user finished the input, the sentence "i a" should be saved as a historical sentence in system. And the following input will be counted as a new search. 

Note:

    1. The input sentence will always start with a letter and end with '#', and only one blank space will exist between two words.
    2. The number of complete sentences that to be searched won't exceed 100. The length of each sentence including those in the historical data won't exceed 100.
    3. Please use double-quote instead of single-quote when you write test cases even for a character input.
    4. Please remember to RESET your class variables declared in class AutocompleteSystem, as static/class variables are persisted across multiple test cases. Please see here for more details.

这道题让我们实现一个简单的搜索自动补全系统,我们用谷歌或者百度进行搜索时,会有这样的体验,输入些单词,搜索框会弹出一些以你输入为开头的一些完整的句子供你选择,这就是一种搜索自动补全系统。根据题目的要求,补全的句子是按之前出现的频率排列的,高频率的出现在最上面,如果频率相同,就按字母顺序来显示。输入规则是每次输入一个字符,然后返回自动补全的句子,如果遇到井字符,表示完整句子结束。那么我们肯定需要一个哈希map,建立句子和其出现频率的映射,还需要一个字符串data,用来保存之前输入过的字符。在构造函数中,给了我们一些句子,和其出现的次数,那么我们就直接将其加入哈希map,然后data初始化为空字符串。在input函数中,我们首先判读输入字符是否为井字符,如果是的话,那么表明当前的data字符串已经是一个完整的句子,在哈希表中次数加1,并且data清空,返回空集。否则的话我们将当前字符加入data字符串中,现在就要找出包含data前缀的前三高频句子了,我们使用优先队列来做,设计的思路是,始终用优先队列保存频率最高的三个句子,那么我们就应该把频率低的或者是字母顺序大的放在队首,以便随时可以移出队列,所以应该是个最小堆,队列里放句子和其出现频率的pair,并且根据其频率大小进行排序,所以我们要重写优先队列的comparator。然后我们遍历哈希表中的所有句子,我们首先要验证当前data字符串是否是其前缀,没啥好的方法,就逐个字符比较,用标识符matched,初始化为true,如果发现不匹配,则matched标记为false,并break掉。然后判断如果matched为true的话,说明data字符串是前缀,那么就把这个pair加入优先队列中,如果此时队列中的元素大于三个,那把队首元素移除,因为我们设计的是最小堆,所以频率小的句子会被先移除。然后就是将优先队列的元素加到结果res中,由于先出队列的是频率小的句子,所以要加到结果res的末尾,参见代码如下:

class AutocompleteSystem {

public:
    AutocompleteSystem(vector<string> sentences, vector<int> times) {
        for (int i = 0; i < sentences.size(); ++i) {
            freq[sentences[i]] += times[i];
        }
        data = "";
    }

    vector<string> input(char c) {
        if (c == '#') {
            ++freq[data];
            data = "";
            return {};
        }
        data.push_back(c);
        auto cmp = [](pair<string, int>& a, pair<string, int>& b) {
            return a.second > b.second || (a.second == b.second && a.first < b.first);
        };
        priority_queue<pair<string, int>, vector<pair<string, int>>, decltype(cmp) > q(cmp);
        for (auto f : freq) {
            bool matched = true;
            for (int i = 0; i < data.size(); ++i) {
                if (data[i] != f.first[i]) {
                    matched = false;
                    break;
                }
            }
            if (matched) {
                q.push(f);
                if (q.size() > 3) q.pop();
            }
        }
        vector<string> res(q.size());
        for (int i = q.size() - 1; i >= 0; --i) {
            res[i] = q.top().first; q.pop();
        }
        return res;
    }

private:
    unordered_map<string, int> freq;
    string data;
};

参考资料:

https://discuss.leetcode.com/topic/96090/straight-forward-hash-table-priority-queue-solution-in-c-no-trie

本文转自博客园Grandyang的博客,原文链接:[LeetCode] Design Search Autocomplete System 设计搜索自动补全系统

,如需转载请自行联系原博主。

时间: 2024-09-20 14:36:37

[LeetCode] Design Search Autocomplete System 设计搜索自动补全系统的相关文章

[LeetCode] Design Log Storage System 设计日志存储系统

You are given several logs that each log contains a unique id and timestamp. Timestamp is a string that has the following format: Year:Month:Day:Hour:Minute:Second, for example, 2017:01:01:23:59:59. All domains are zero-padded decimal numbers. Design

使用Bootstrap typeahead插件实现搜索框自动补全的方法_javascript技巧

这就是贴代码的坏处之一:搜索框快被网友玩儿坏了!!!有故意输入空格的,有输入or 1=1的,有alert的,有html乱入的.......而且好像还在玩儿,随他们去吧,只要开心就好. 在项目中,经常会用到输入框的自动补全功能,就像百度.淘宝等搜索框一样:当用户输入首字母.关键词时,后台会迅速将与此相关的条目返回并显示到前台,以便用户选择,提升用户体验.当然本项目的补全功能和这些大厂的技术是没有可比性的,但用于站内搜索也是绰绰有余了. 接触到的自动补全插件主要有两个:autocomplete和ty

详解jQuery UI库中文本输入自动补全功能的用法_jquery

自动补全(autocomplete),是一个可以减少用户输入完整信息的UI 工具.一般在 输入邮箱.搜索关键字等,然后提取出相应完整字符串供用户选择. 一.调用autocomplete()方法 $('#email').autocomplete({ source : ['aaa@163.com', 'bbb@163.com', 'ccc@163.com'], }); 二.修改autocomplete()样式   由于autocomplete()方法是弹窗,然后鼠标悬停的样式.通过Firebug 想

[LeetCode] Design Excel Sum Formula 设计Excel表格求和公式

Your task is to design the basic function of Excel and implement the function of sum formula. Specifically, you need to implement the following functions: Excel(int H, char W): This is the constructor. The inputs represents the height and width of th

emac-Emacs 自动补全 auto-complete yasnippet 光标空白处不显示

问题描述 Emacs 自动补全 auto-complete yasnippet 光标空白处不显示 我在ubuntu中配置了emacs 的自动补全,现在碰到一个问题,在出现自动补全的时候,光标在有字符的地方会闪烁,在没有字符或者空白处无法看到光标,请问怎么让光标都在空白处也显示 下面的是我自动补全的配置 ;; yasnippet (add-to-list 'load-path "~/.emacs.d/yasnippet-0.6.1c") (require 'yasnippet);; no

关于搜索框的自动补全问题

问题描述 在做搜索框的自动补全功能时,当打开输入法的情况下,敲击键盘,屏幕上出现输入法的中文提示,搜索框中也出现了相应的拼音,但是在没有敲击空格输入中文前,搜索框下方不会出现提示框,请问这个问题怎么解决??? 解决方案 这很正常吧.你还没有输入文字呢,怎么会显示自动补全框呢!解决方案二:还没输出,就不会触发匹配事件.除非程序在输入法打开情况下也能监控输入.

【技术贴】MyEclipse打出syso代码不能自动补全补全不能输出system.out.print

在MyEclipse6.0甚至更高的6.5GA版本中的快捷键中把我们习惯性使用的Alt+/进行代码自动补齐 但是由于于之前版本有快捷键有冲突,所以总之不能自动提示 以下是解决方法 方法如下: 1.选择MyEclipse6.X菜单栏中的Window->preferences: 2.选择General->keys; 3.在右侧中间的窗体中点击word completion后再点击remove binding,在下方的binding中随便输入一个快捷键: 4.然后选择Content Assist点击

《jQuery、jQuery UI及jQuery Mobile技巧与示例》——7.2 技巧:使用自动补全微件提示输入值

7.2 技巧:使用自动补全微件提示输入值 在一些网站上,你可以找到用于选择的下拉菜单,它们包含了极长的选项列表.在许多情况下,可以使用具有自动补全功能的输入框取代下拉式菜单来帮助用户.省去了滚动选择,用户只要输入目标选项的第一个字符,然后自动补全组件便可以完成剩下的事. 代码清单7-2提供了一个自动补全的例子,它使用一段称为"Lorem Ipsum"的文字来实现输入第一个字后的补全.这段文字起源于两千年前,但仍然使用在图形设计和排版行业(通常被称为"假文"或&quo

Bootstrap3使用typeahead插件实现自动补全功能_javascript技巧

很酷的一个自动补全插件 http://twitter.github.io/typeahead.js 在bootstrap中使用typeahead插件,完成自动补全 相关的文档:https://github.com/twitter/typeahead.js/blob/master/doc/jquery_typeahead.md 数据源: Local:数组 prefectch:json remote等方式 -----------------------------------------------