问题背景

现有大量的多组数据，每组数据的格式为“id category”，一行有一个十位数左右的id，和一个较短的分类标签。现在输入一个id，要求按照前面标记的数据进行分类。
如下面标记数据:

1234 c1
1235 c2
1236 c3

再输入"1234",则应输出"c1".
用于测试的数据，还有结果比对程序，我放在最下面，50000条测试。

最暴力做法O(n^2)

先把id以string存入string数组，同时把category存入另一个序号相同的string数组。
输入一个id后从前往后找，找到后输出分类。
这种方式适用于初学者，只会c语言。就当看个热闹吧😂

#include<bits/stdc++.h>
using namespace std;
string a[50010];
string name[50010];
int main()
{
    clock_t start,finish;
    double totaltime;
    start=clock();
    //程序
    freopen("in2.txt","r",stdin);
    freopen("out.txt","w",stdout);
    string id,cg;
    int t=0;
    while(cin>>id&&id!="0")
    {
        cin>>cg;
        a[t]=id;
        name[t]=cg;
        t++;
    }
    while(cin>>id)
    {
        for(int i=0;i<t;i++)
        {
            if(id==a[i])
            {
                cout<<name[i]<<endl;
                break;
            }
        }
    }
    //程序
    finish=clock();
    totaltime=(double)(finish-start)/CLOCKS_PER_SEC;
    cout<<"\n此程序的运行时间为"<<totaltime<<"秒！"<<endl;
    return 0;
}

感人的运行时间！

最实用的做法O(nlogn)

稍微学过c++的人应该都知道有个神奇的东西叫stl库，其中内置的map非常好用，map也叫做映射，能将一种数据类型映射到另一个数据类型上，所以这个问题用一个map<string,string>就能解决了。

#include<bits/stdc++.h>
using namespace std;
map<string,string>a;
int main()
{
    clock_t start,finish;
    double totaltime;
    start=clock();
    freopen("in2.txt","r",stdin);
    freopen("out.txt","w",stdout);
    string id,cg;
    while(cin>>id&&id!="0")
    {
        cin>>cg;
        a[id]=cg;
    }
    while(cin>>id)
    {
        cout<<a[id]<<endl;;
    }
    finish=clock();
    totaltime=(double)(finish-start)/CLOCKS_PER_SEC;
    cout<<"\n此程序的运行时间为"<<totaltime<<"秒！"<<endl;
    return 0;
}

又快又短,是不是很刺激？

最快最小题大作的做法O(n)

这个复杂度我也没有仔细算，只是隐隐约约jio的是O(n)。
这个做法用到了trie树的知识，而且花了不少的时间，那为什么花了那么长时间，还有大量计算空间来优化一个logn？
因为~~我想追求极致的速度！！~~算错复杂度了，我一直以为map的查询复杂度是nlogn，然后，就高高兴兴的开始搞了。

有兴趣的可以先了解一下trie树

这里就是把trie树中的字母，换成了数字，结尾的标记换成了分类标记，与上面连接中不同的是，这次我用的是结构体+链表，思路上比较直观，缺点是占用内存较大。

#include<bits/stdc++.h>
using namespace std;
struct node
{
    string name;
    bool flag;
    int book[12];
    struct node *next[12];
    init()
    {
        name="root";
        flag=false;
    }
};
struct node *root,*now,*newp;
void  instert(string id,string cg)
{
    int len=id.size();
    now=root;
    for(int i=0;i<len;i++)
    {
        if(now->book[int(id[i]-'0')]==0)
        {
            newp = new node;//坑点之一，以前不知道，结构体中有指针，单纯的分配空间不行。
            // cout<<int(id[i]-'0')<<endl;
            memset(newp->book,0,sizeof(newp->book));
            newp->flag=false;
            now->next[int(id[i]-'0')]=newp;
            now->book[int(id[i]-'0')]=1;
        }
        now=now->next[int(id[i]-'0')];
    }
    now->name=cg;
    now->flag=true;
}
string check(string id)
{
    int len=id.size();
    now=root;
    for(int i=0;i<len;i++)
    {
        if(now->book[int(id[i]-'0')]==0)
        {
            return "not found";
        }
        if(i==len-1)
        {
            return now->next[int(id[i]-'0')]->name;
        }
        now=now->next[int(id[i]-'0')];
    }
    return "not found";
}
int main()
{
    // clock_t start,finish;
    // double totaltime;
    // start=clock();
    //用于时间测试
    string id,cg;
    freopen("in2.txt","r",stdin);
    freopen("out.txt","w",stdout);
    root=(struct node*)malloc(sizeof(struct node));
    root = new node;//坑点之一，以前不知道，结构体中有指针，单纯的分配空间不行。
    root->init();
    memset(root->book,0,sizeof(root->book));
    while(cin>>id&&id!="0")
    {
        cin>>cg;
        instert(id,cg);
        // cout<<"yes"<<endl;
    }
    while(cin>>id)
    {
        cout<<check(id)<<endl;
    }
    // finish=clock();
    // totaltime=(double)(finish-start)/CLOCKS_PER_SEC;
    // cout<<"\n此程序的运行时间为"<<totaltime<<"秒！"<<endl;
    //用于时间测试
    return 0;
}

结果还算看得过去，只是这点时间上的优化，其实并不是很关键😂。但我相信，只要数据量大起来，优势应该就会体现出来。

总结

感谢杰哥帮我debug，牛逼┗|｀O′|┛ 嗷~~。

话说回来，数据结构与算法也自学一年了，离真正竞赛的水平还有很大差距，但这次离开ACM后的一个小项目中，让我真正感受到了算法与数据结构的魅力与作用，算法是服务于应用的，以前的刷题没有让我感受到这一点，甚至不知道这些算法与数据结构除了做题还可以感谢什么，但现在我真的体会到了。

它能带给我异于常人的思路，更好的解决问题的方法，感谢ACM带给我的一切。

测试用

时间测试程序

#include<iostream.h>
#include<time.h>
void main()
{
clock_t start,finish;
double totaltime;
start=clock();
……                     //把你的程序代码插入到这里面
finish=clock();
totaltime=(double)(finish-start)/CLOCKS_PER_SEC;
cout<<"\n此程序的运行时间为"<<totaltime<<"秒！"<<endl;
}

结果对比程序

#include<bits/stdc++.h>
using namespace std;
string a[50010];
int main()
{
    freopen("out1.txt","r",stdin);
    string cg;
    int t=0;
    while(cin>>cg)
    {
        a[t++]=cg;
    }
    freopen("out2.txt","r",stdin);
    for(int i=0;i<t;i++)
    {
        cin>>cg;
        if(cg!=a[i])
        {
            cout<<"error";
            return 0;
        }
    }
    cout<<"right";
    return 0;
}

测试数据格式：

1324 c1
1234 c8
.
.
.
343452345 c2
0
1324
1234
.
.
.
343452345

为了输入方便我用0，把标记数据和测试数据分开，直接用上面的程序就可以运行。
输出格式：

c1
c8
.
.
.
c2

为了好比对，所以没有打乱顺序。

基于trie树的字符串查找程序

问题背景

最暴力做法O(n^2)

最实用的做法O(nlogn)

最快最小题大作的做法O(n)

总结

测试用

测试数据:https://www.lanzous.com/i5zw5tg

输出文件(用于和自己的输出文件做对比):https://www.lanzous.com/i5zw5uh