A Complete English Word-Frequency Counting Example Using a File
Published: 2019-06-22


 

1. Read in the string to analyze

  

str='''We don't talk anymoreWe don't talk anymoreWe don't talk anymoreLike we used to do We don't laugh anymoreWhat was all of it for? We don't talk anymore Like we used to doI just heard you found the one you've been lookin'The one you been looking forI wish i would've konwn that wasn't me Cause even after all this time i still wonderWhy i can't move on? Just the way you dance so easliy Don't wanna know The kinda dress you're wearin' tonightIf he's holdin' onto you so tightThe way i did beforeI overdosed Should've known your love was gameNow I can't get'cha out of my brainOoh it's such a shameWe don't talk anymoreWe don't talk anymore We don't talk anymoreLike we used to do We don't laugh anymoreWhat was all of it for?We don't talk anymore Like we used to doI just hope you'r lyin' next to somebodyKnow it's hard to love ya like meMust be a good reason that you're goneEvery now and thenI think you might want me to come show up your doorBut I'm just too afraid that i'll be worngDon't wanna know If you'ra lookin' into her eyesIf she's holdin' onto you so tightThe way i did beforeI overdosedShould've know your love was a game Now I can't get'cha out of my brainOoh it's such a shameWe don't talk anymoreWe don't talk anymoreWe don't talk anymoreLike we used to do We don't laugh anymoreWhat was all of it for? We don't talk anymoreLike we used to doLike we used to doDon't wanna knowThe kinda dress you're wearin' tonightIf he's givin' it to you just rightThe way i did beforeI overdosedShould've know your love was a game Now I can't get'cha out of my brainOoh it's such a shameWe don't talk anymoreWe don't talk anymoreWe don't talk anymoreLike we used to do We don't laugh anymoreWhat was all of it for? We don't talk anymoreLike we used to doWe don't talk anymoreThe way did beforeWe don't talk anymoreOoh WooOoh it's such a shameWe don't talk anymore'''
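The script later in this post reads its input from a file named 1.txt rather than from the string directly. A minimal sketch of saving text to a file and reading it back follows. The helper names save_text and load_text, the short placeholder sample, and the temporary directory are assumptions for illustration and are not part of the original post.

```python
import os
import tempfile


def save_text(path, text):
    """Write text to path, overwriting any existing file."""
    with open(path, 'w', encoding='utf-8') as f:
        f.write(text)


def load_text(path):
    """Return the full contents of the file at path."""
    with open(path, 'r', encoding='utf-8') as f:
        return f.read()


# Placeholder sample; substitute the text you actually want to analyze.
sample = "We talk. We laugh, we talk?"

# A temporary directory keeps the sketch from clobbering a real 1.txt.
path = os.path.join(tempfile.mkdtemp(), '1.txt')
save_text(path, sample)
print(load_text(path) == sample)
```

Using `with open(...)` closes the file automatically, even if reading or writing raises an exception.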

2. Split the text into words

3. Build a count dictionary

4. Exclude grammatical (function) words

5. Sort

6. Output the top 20
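One detail in step 2 is easy to miss: once punctuation has been replaced by spaces, splitting on a single space produces empty strings and leaves words on either side of a newline glued together. The tiny example below, using a made-up input line, contrasts `split(' ')` with `split()`.

```python
# Made-up input with a comma, a double space and a line break.
line = "we talk,  we laugh\nwe talk"

# Replace punctuation with spaces, as in step 2.
for ch in ',.?':
    line = line.replace(ch, ' ')

by_space = line.split(' ')    # splits on each single space only
by_whitespace = line.split()  # splits on any run of whitespace

print(by_space)       # contains '' entries and the fused 'laugh\nwe'
print(by_whitespace)  # clean list of words
```

Because the empty strings would otherwise be counted as a "word", `split()` with no argument is the safer choice for this task.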

 

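# Read the text to analyze from a file; "with" closes the file automatically
with open('1.txt', 'r') as fo:
    text = fo.read()

text = text.lower()  # convert to lowercase
for ch in ',.?':
    text = text.replace(ch, ' ')  # replace punctuation with spaces
words = text.split()  # split on any whitespace to extract the words
print(words)

exc = {'to', 'a', 'of', 'it'}  # frequent words that carry little meaning
keys = set(words) - exc  # distinct words, minus the excluded function words

dic = {}
for w in keys:
    dic[w] = words.count(w)  # count dictionary: word -> occurrences
print(dic)

wc = list(dic.items())  # list of (word, count) pairs
wc.sort(key=lambda x: x[1], reverse=True)  # sort by count, descending
print(wc)

for item in wc[:20]:  # output the top 20 (fewer if there are fewer words)
    print(item)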
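Calling `words.count()` once per distinct word rescans the whole list each time, which gets slow on long texts. As an alternative sketch (not the original author's method), the standard library's `collections.Counter` tallies every word in a single pass, and `re.findall` extracts words while keeping apostrophes inside contractions. The function name top_words, the regular expression, and the sample sentence are assumptions for illustration; the stop-word set mirrors the one used above.

```python
import re
from collections import Counter

# Same excluded words as the set used in the script above.
STOP_WORDS = frozenset({'to', 'a', 'of', 'it'})


def top_words(text, n=20, stop_words=STOP_WORDS):
    """Return up to n (word, count) pairs, most frequent first."""
    # Runs of lowercase letters and apostrophes count as words, so
    # "don't" stays whole. A stray apostrophe would also match.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in stop_words)
    return counts.most_common(n)


# Made-up sample sentence, not taken from the original post.
sample = "We talk, we laugh. We talk to a friend of a friend."
for word, count in top_words(sample, n=3):
    print(word, count)
```

`most_common(n)` simply returns fewer pairs when the text has fewer than n distinct words, so no bounds check is needed.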

Output:

 

Reposted from: https://www.cnblogs.com/nigongbin/p/7598501.html
