2019-01-21 15:19:07    2019-07-23 09:51:17   

python
#### Analysis

A WeChat verified-account name usually contains a place name, so the region can be inferred by checking whether a province, city, or district name from the database occurs in the verified name (`Note: this method has a certain probability of false matches`).

#### Preparation

The province, city, and district tables must be populated with data. Contact the blog author if you need the data.

#### Steps

1. Check whether a province name occurs in the verified name. If it does, take the match; if the province is a direct-administered municipality, also take the corresponding city name and id.
2. If no province matched, match against the city names. On success, look up the owning province's name and id.
3. If neither a province nor a city matched, match against the district/county names. On success, look up the owning city and province names and ids.

##### Table structure

```sql
CREATE TABLE `province` (
  `id` bigint(19) NOT NULL AUTO_INCREMENT,
  `name` varchar(32) NOT NULL COMMENT 'province name',
  `short_name` varchar(32) DEFAULT NULL COMMENT 'province short name',
  `remark` varchar(255) DEFAULT NULL COMMENT 'remark',
  `created_at` datetime NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=36 DEFAULT CHARSET=utf8;

CREATE TABLE `city` (
  `id` bigint(19) NOT NULL AUTO_INCREMENT COMMENT 'id',
  `name` varchar(32) NOT NULL COMMENT 'city name',
  `short_name` varchar(32) DEFAULT NULL COMMENT 'city short name',
  `province_id` bigint(19) NOT NULL COMMENT 'id of the owning province',
  `level` int(10) NOT NULL COMMENT 'city tier (0: unknown, 1: first, 2: second, 3: third, 4: fourth)',
  `remark` varchar(255) DEFAULT NULL COMMENT 'remark',
  `created_at` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT 'creation time',
  PRIMARY KEY (`id`),
  UNIQUE KEY `name` (`name`,`province_id`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=7471 DEFAULT CHARSET=utf8;

CREATE TABLE `district` (
  `id` bigint(19) NOT NULL AUTO_INCREMENT COMMENT 'id',
  `name` varchar(32) NOT NULL COMMENT 'district/county name',
  `short_name` varchar(32) DEFAULT NULL COMMENT 'district/county short name',
  `city_id` bigint(19) NOT NULL COMMENT 'id of the owning city',
  `remark` varchar(255) DEFAULT NULL COMMENT 'remark',
  `created_at` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT 'creation time',
  PRIMARY KEY (`id`),
  UNIQUE KEY `name` (`name`,`city_id`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=10069 DEFAULT CHARSET=utf8;
```

##### Python script

```bash
vim get_loc.py
```

```python
# coding: utf-8
import pymysql

db_host = "xx.xx.xx.xx"
db_user = "root"
db_pass = "xxxxx"
db_port = 3306
db_name = "xxxx"

# Direct-administered municipalities: the matching city ids are hard-coded
MUNICIPALITY_CITY_IDS = {"北京市": 7151, "上海市": 7122, "天津市": 7182, "重庆市": 7430}


def execute_query_sql(sql):
    """Run a query on a fresh connection and return all rows."""
    db = pymysql.connect(host=db_host, port=db_port, user=db_user,
                         passwd=db_pass, db=db_name)
    try:
        cursor = db.cursor()
        cursor.execute(sql)
        return cursor.fetchall()
    finally:
        # Always close the connection, even if the query fails
        db.close()


def strip_suffix(name):
    # "广州市" should also match a verified name that only says "广州"
    return name.replace("市", "").replace("县", "")


# The lookup tables do not change while the script runs, so load them once
provinces = execute_query_sql("select id,name from province;")
cities = execute_query_sql("select id,name,province_id from city;")
districts = execute_query_sql("select id,name,city_id from district;")
province_names = {row[0]: row[1] for row in provinces}
city_by_id = {row[0]: row for row in cities}
district_names = {row[0]: row[1] for row in districts}

# weixin_auth.txt holds one WeChat verified name per line
with open("weixin_auth.txt", "r", encoding="utf-8") as src, \
        open("weixin_auth_loc.txt", "a", encoding="utf-8") as dst:
    for n, line in enumerate(src, start=1):
        auth_name = line.strip()
        pro_id = city_id = district_id = -1
        matched = False

        # 1. Province: compare the first two characters of the province name
        for p_id, p_name in provinces:
            if p_name[0:2] in auth_name:
                matched = True
                pro_id = p_id
                city_id = MUNICIPALITY_CITY_IDS.get(p_name, -1)
                break

        # 2. City: only tried when no province matched
        if not matched:
            for c_id, c_name, c_pro_id in cities:
                if c_pro_id > 0 and strip_suffix(c_name) in auth_name:
                    matched = True
                    city_id, pro_id = c_id, c_pro_id
                    break

        # 3. District: only tried when neither province nor city matched
        if not matched:
            for d_id, d_name, d_city_id in districts:
                if d_city_id > 0 and strip_suffix(d_name) in auth_name:
                    district_id, city_id = d_id, d_city_id
                    pro_id = city_by_id[city_id][2]
                    break

        # Resolve the names for whichever ids were found
        pro_name = province_names[pro_id] if pro_id > 0 else ""
        city_name = city_by_id[city_id][1] if city_id > 0 else ""
        district_name = district_names[district_id] if district_id > 0 else ""

        row = ",".join([auth_name, str(pro_id), str(city_id), str(district_id),
                        pro_name, city_name, district_name])
        print("Processing line " + str(n))
        print(row)
        # Save the result to the output file
        dst.write(row + "\n")
```
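
The three-step fallback can also be tried out without a database. The following is a minimal sketch, assuming tiny in-memory tables: the sample rows, their ids, and the `match_region` helper are hypothetical and not part of the original script, and the special handling of municipalities is left out.

```python
# Hypothetical sample rows; real data would come from the province/city/district tables.
PROVINCES = {1: "广东省", 2: "北京市"}
CITIES = {10: ("广州市", 1), 20: ("北京市", 2)}          # id -> (name, province_id)
DISTRICTS = {100: ("番禺区", 10), 200: ("海淀区", 20)}    # id -> (name, city_id)


def _key(name):
    # Same normalisation as the script: drop the "市" and "县" characters.
    return name.replace("市", "").replace("县", "")


def match_region(auth_name):
    """Return (province_id, city_id, district_id); -1 means "not found"."""
    # Step 1: province, matched on the first two characters of its name
    for pid, pname in PROVINCES.items():
        if pname[:2] in auth_name:
            return pid, -1, -1
    # Step 2: city, which also gives the owning province
    for cid, (cname, pid) in CITIES.items():
        if _key(cname) and _key(cname) in auth_name:
            return pid, cid, -1
    # Step 3: district, which gives the owning city and province
    for did, (dname, cid) in DISTRICTS.items():
        if _key(dname) and _key(dname) in auth_name:
            return CITIES[cid][1], cid, did
    return -1, -1, -1


print(match_region("广东某某科技有限公司"))  # province-level match
print(match_region("广州某某学校"))          # city-level match
print(match_region("番禺区某某小学"))        # district-level match
print(match_region("广州市北京路小学"))      # false match: "北京" wins at step 1
```

The last call shows the kind of false match the note in the Analysis section warns about: a street name that contains a province-level name is picked up before the city is ever checked.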

   2019-01-16 12:00:16    2019-01-16 12:00:16   

python selenium crawler
### This post shows how to crawl Sogou WeChat official-account data with Python, Selenium, Firefox, and MySQL

`Note: Sogou WeChat has anti-crawler measures and a Scrapy-based crawler is easily detected, so Selenium is used instead (drawback: slow; advantage: harder to detect).`

For installing the required software, see: https://ynotes.cn/blog/article_detail/158

#### Flow:

1. The script repeatedly queries the keyword table (table `keys`) for the first 100 keywords (column `keyword`) that belong to the configured keyword types (column `type`).
2. It loops over the keywords and searches for each one on Sogou WeChat.
3. It scrapes the WeChat official accounts returned by the search.
4. It checks whether the result is paginated and, if so, crawls every page. After each page it updates the crawled-page counter (column `page_num`); after all pages it updates the status column of the keyword table (column `status`, 0: not crawled, 1: crawled).
5. The scraped rows are inserted into the official-account table (`weixin_data`); create the tables below first.
6. The keyword's status is updated to "crawled".

#### Table structure

```sql
CREATE TABLE `keys` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `keyword` varchar(255) DEFAULT NULL,
  `page_num` int(11) DEFAULT '0',
  `status` int(11) DEFAULT '0' COMMENT '0 not searched, 1 searched, 99 discarded',
  `type` varchar(255) DEFAULT NULL,
  `is_drop` int(11) NOT NULL DEFAULT '0',
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=119750 DEFAULT CHARSET=utf8;

CREATE TABLE `weixin_data` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `key_id` int(255) DEFAULT NULL,
  `weixin_name` varchar(255) DEFAULT NULL,
  `weixin_account` varchar(255) DEFAULT NULL,
  `weixin_auth_info` varchar(255) DEFAULT NULL,
  `is_auth` int(11) DEFAULT NULL,
  `describe` varchar(6000) DEFAULT NULL,
  `img_url` varchar(255) DEFAULT NULL,
  `loc_info` varchar(255) DEFAULT NULL,
  `privince` varchar(255) DEFAULT NULL,
  `city` varchar(255) DEFAULT NULL,
  `district` varchar(255) DEFAULT NULL,
  `weixin_type` varchar(255) DEFAULT NULL,
  `other` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `weixin_account` (`weixin_account`)
) ENGINE=InnoDB AUTO_INCREMENT=139746 DEFAULT CHARSET=utf8;
```

#### Crawler script

scrapy_sogou.py

```python
# coding=utf-8
from selenium import webdriver
import time
from selenium.common.exceptions import NoSuchElementException, TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import pymysql

# Firefox profile
# Pointing at an existing profile directory reuses its cache and cookies,
# which helps when crawling sites that require login
profile = webdriver.FirefoxProfile(r'C:\Users\Administrator.GZLX-20180416SV\AppData\Roaming\Mozilla\Firefox\Profiles\yn80ouvt.default')
# If no cookies are needed, use an empty profile instead:
# profile = webdriver.FirefoxProfile()
# Do not load style sheets
profile.set_preference("permissions.default.stylesheet", 2)
# Do not load images
profile.set_preference("permissions.default.image", 2)
# Do not run JavaScript
profile.set_preference("javascript.enabled", False)
# Proxy settings (replace the xx.xx.xx.xx and xxxx placeholders; the ports must be integers)
profile.set_preference('network.proxy.type', 1)
profile.set_preference('network.proxy.http', 'xx.xx.xx.xx')
profile.set_preference('network.proxy.http_port', xxxx)
profile.set_preference('network.proxy.ssl', 'xx.xx.xx.xx')
profile.set_preference('network.proxy.ssl_port', xxxx)
profile.update_preferences()

# Database settings
db_host = "xx.xx.xx.xx"
db_user = "root"
db_pass = "xxxx"
db_port = 3306
db_name = "weixin_data"


def new_driver():
    # Selenium 3-style arguments; newer Selenium releases use Service and Options objects instead
    return webdriver.Firefox(firefox_profile=profile, executable_path="geckodriver")


driver = new_driver()

# Keyword types to search
key_search_list = ['学校']
index_url = 'https://weixin.sogou.com/weixin?query='
# Quoted, comma-separated list for the SQL "in (...)" clause
keys_search_string = ",".join("'" + k + "'" for k in key_search_list)


class AnyEc:
    """Use with WebDriverWait to combine expected_conditions in an OR."""

    def __init__(self, *args):
        self.ecs = args

    def __call__(self, driver):
        for fn in self.ecs:
            try:
                if fn(driver):
                    return True
            except Exception:
                pass


def execute_query_sql(sql, args=None):
    """Run a query on a fresh connection and return all rows."""
    db = pymysql.connect(host=db_host, port=db_port, user=db_user,
                         passwd=db_pass, db=db_name)
    try:
        cursor = db.cursor()
        cursor.execute(sql, args)
        return cursor.fetchall()
    finally:
        db.close()


def execute_update_sql(sql, args=None):
    """Run an insert/update on a fresh connection and commit it."""
    db = pymysql.connect(host=db_host, port=db_port, user=db_user,
                         passwd=db_pass, db=db_name)
    try:
        cursor = db.cursor()
        cursor.execute(sql, args)
        db.commit()
    finally:
        db.close()


# Extract the official accounts listed on the current result page
def parseWeb(driver, key_id, key_name, page_N):
    print("Extracting keyword " + key_name + ", page " + str(page_N) + ":")
    for page in driver.find_elements(By.XPATH, '//ul[@class="news-list2"]/li'):
        weixin_name = page.find_element(By.XPATH, './div[@class="gzh-box2"]/div[@class="txt-box"]/p[@class="tit"]/a').text
        img_url = page.find_element(By.XPATH, './div[@class="gzh-box2"]/div[@class="img-box"]/a/img').get_attribute("src")
        weixin_account = page.find_element(By.XPATH, './div[@class="gzh-box2"]/div[@class="txt-box"]/p[@class="info"]/label').text
        weixin_auth_info = ""
        try:
            # "微信认证" (WeChat verification) is the label text shown on the page
            page.find_element(By.XPATH, './dl[2]/dt[contains(text(),"微信认证")]')
            dl_info = page.find_element(By.XPATH, './dl[2]/dt').text
            if '微信认证' in dl_info:
                weixin_auth_info = page.find_element(By.XPATH, './/dl[2]/dd').text
        except NoSuchElementException:
            weixin_auth_info = ""
        print("WeChat verification: " + weixin_auth_info)
        try:
            page.find_element(By.XPATH, './div[@class="gzh-box2"]/div[@class="txt-box"]/p[@class="tit"]/i')
            is_auth = 1
        except NoSuchElementException:
            is_auth = 0
        try:
            describe = page.find_element(By.XPATH, './/dl[1]/dd').text
        except NoSuchElementException:
            describe = ""
        # Insert the row; placeholders avoid broken SQL when a value contains quotes
        insert_sql = ('insert into weixin_data(key_id,weixin_name,weixin_account,weixin_auth_info,'
                      'is_auth,img_url,`describe`,weixin_type,other) '
                      'values(%s,%s,%s,%s,%s,%s,%s,%s,NULL);')
        insert_args = (key_id, weixin_name, weixin_account, weixin_auth_info,
                       is_auth, img_url, describe, '培训机构')
        try:
            print("Inserting: " + weixin_name)
            execute_update_sql(insert_sql, insert_args)
        except Exception:
            print("Insert failed, possibly a duplicate row")
    # Record the page number that has just been crawled
    update_sql = 'update `keys` set page_num=%s where keyword=%s;'
    print(update_sql)
    try:
        execute_update_sql(update_sql, (page_N, key_name))
    except Exception:
        print("Failed to update the crawled page number")
        return False
    return True


# Has the page finished loading?
def pageIsLoadFinished(driver):
    try:
        WebDriverWait(driver, 10).until(AnyEc(
            EC.presence_of_element_located(
                (By.XPATH, u'//div[@class="gzh-box2"]/div[@class="img-box"]/a/img')),
            EC.presence_of_element_located(
                (By.XPATH, u'//p[@class="ip-time-p"]')),
            EC.presence_of_element_located(
                (By.XPATH, u'//div[@id="noresult_part1_container"]'))
        ))
        return True
    except TimeoutException:
        return False


# Is this a normal page (not the "too many requests from this IP" page)?
def pageIsNormal(driver):
    try:
        driver.find_element(By.XPATH, '//p[@class="ip-time-p"]')
        print("IP is being rate-limited, restarting the browser")
        time.sleep(3)
        return False
    except NoSuchElementException:
        return True


# Is this the "no results" page?
def pageIsNotFound(driver, key_name):
    try:
        driver.find_element(By.XPATH, '//div[@id="noresult_part1_container"]')
        print("Keyword " + key_name + " returned no results, moving on to the next keyword")
        return True
    except NoSuchElementException:
        return False


# Jump to the given page number
def jumpNumPage(driver, page_N):
    # Already on that page?
    try:
        current_page = driver.find_element(By.XPATH, '//div[@id="pagebar_container"]/span').text
        if int(page_N) == int(current_page):
            print("Already on the requested page, no jump needed")
            return True
    except Exception:
        print("No current page " + str(page_N))
        return False
    try:
        driver.find_element(By.XPATH, '//div[@id="pagebar_container"]/a[@id="sogou_page_' + str(page_N) + '"]').click()
    except NoSuchElementException:
        print("There is no page " + str(page_N))
        return False
    return True


# Jump to the next page
def jumpNextPage(driver):
    try:
        driver.find_element(By.XPATH, '//div[@id="pagebar_container"]/a[@id="sogou_next"]').click()
    except NoSuchElementException:
        print("No next page")
        return False
    return True


# Is the page ready to be parsed?
def PageIsReady(driver, key_name, page_N):
    # The page must have finished loading and must not be the rate-limit page
    if not (pageIsLoadFinished(driver) and pageIsNormal(driver)):
        return False
    # If the requested page number is not available, check for the "no results" page
    if not jumpNumPage(driver, page_N):
        if pageIsNotFound(driver, key_name):
            # Nothing to crawl for this keyword: mark it as done
            try:
                execute_update_sql('update `keys` set status=1 where keyword=%s;', (key_name,))
            except Exception:
                print("Failed to update the status of keyword " + key_name)
    return True


# Main crawl loop
while True:
    # `keys` is a reserved word in MySQL, so the table name must be quoted with backticks
    get_keys = ("SELECT id,keyword FROM `keys` where status=0 and is_drop=0 and type in ("
                + keys_search_string + ") limit 100;")
    print(get_keys)
    print("Fetching keywords...")
    try:
        results = execute_query_sql(get_keys)
    except Exception:
        print("Keyword query failed, stopping the crawler")
        break
    print("Keywords fetched")
    if not results:
        # No pending keywords left
        break
    for key_id, key_name in results:
        url = index_url + key_name
        print("Crawling: " + url)
        try:
            driver.get(url)
        except TimeoutException:
            continue
        # Resume from the page after the last one crawled for this keyword
        get_page_sql = 'select page_num from `keys` where id=%s;'
        page_N = execute_query_sql(get_page_sql, (key_id,))[0][0] + 1
        if PageIsReady(driver, key_name, page_N):
            if not parseWeb(driver, key_id, key_name, page_N):
                continue
        else:
            # Restart the browser and move on
            time.sleep(1)
            driver.quit()
            driver = new_driver()
            continue
        # Keep crawling while there is a next page.
        # To crawl only one page per keyword, comment out the block below.
        ########## crawl all pages of the keyword -- start
        isOk = True
        while jumpNextPage(driver):
            page_N = execute_query_sql(get_page_sql, (key_id,))[0][0] + 1
            if PageIsReady(driver, key_name, page_N):
                if not parseWeb(driver, key_id, key_name, page_N):
                    isOk = False
                    break
            else:
                isOk = False
                break
        if not isOk:
            time.sleep(1)
            driver.quit()
            driver = new_driver()
            continue
        ########## crawl all pages of the keyword -- end
        # Mark the keyword as crawled
        try:
            execute_update_sql('update `keys` set status=1 where keyword=%s;', (key_name,))
        except Exception:
            print("Failed to update the status of keyword " + key_name)
```
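
The keyword query is assembled from a Python list, which is a natural place for placeholders. The following is a minimal sketch of building the `IN (...)` clause with `%s` placeholders, the parameter style PyMySQL's `cursor.execute(sql, args)` accepts; `build_keyword_query` is a hypothetical helper and not part of the original script.

```python
def build_keyword_query(types, limit=100):
    """Return (sql, params) selecting pending keywords of the given types."""
    if not types:
        raise ValueError("at least one keyword type is required")
    # One %s per type; the driver fills in and escapes the values
    placeholders = ", ".join(["%s"] * len(types))
    sql = ("SELECT id, keyword FROM `keys` "
           "WHERE status = 0 AND is_drop = 0 AND type IN ({}) "
           "LIMIT %s".format(placeholders))
    return sql, list(types) + [int(limit)]


sql, params = build_keyword_query(["学校"])
print(sql)
print(params)
```

The pair would then be passed as `cursor.execute(sql, params)`, so a keyword type containing a quote character cannot break the statement.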

   2018-11-11 23:24:22    2018-11-11 23:24:22   

C language C Windows API

   2018-10-29 23:59:04    2018-10-29 23:59:04   

Snake easyx C language
Install EasyX and save the file with a .cpp extension.

```cpp
#include <graphics.h>
#include <conio.h>
#include <time.h>
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <windows.h>

#define KEYDOWN(vk_code) ((GetAsyncKeyState(vk_code) & 0x8000) ? 1 : 0)

void init(void);       // initialise the window
void gamebegin(void);  // set up the game
void gameplay(void);   // main game loop
void gameend(void);    // game over screen

// A point on the board
struct Point{
    int x,y;
};

struct Point aps[3871];    // all grid points on the board (79 columns x 49 rows)
struct Point snake[3871];  // positions of the snake's segments, head first
struct Point food[100];    // positions of the food items
int apsindex=0;            // number of entries used in aps
int snakeindex=0;          // number of entries used in snake
int foodindex=0;           // number of entries used in food
struct Point snakehead={300,250}; // initial head position
int snakedirect;           // 1 up, 2 down, 3 left, 4 right
int snakespeed=1;          // snake speed
int snakelength=3;         // initial snake length
int gamestop=0;            // set to 1 when the game is over
int stoptype;              // reason for game over: 1 hit the wall, 2 hit its own body
int gamescore=0;           // score
int gamelevel=1;           // level

int main(void){
    init();
    gamebegin();
    gameplay();
    gameend();
    return 0;
}

// Initialise the graphics window
void init(void){
    initgraph(1100,600);
}

// Draw the score
// (char-based strings: this assumes the project is built with the multi-byte character set)
void drawscore(int score){
    char s[50];
    sprintf(s,"Score:%4d",score);
    settextstyle(18, 0, _T("黑体"));
    settextcolor(RGB(255,255,0));
    outtextxy(950, 60, s);
}

// Draw the level
void drawlevel(int level){
    char s[50];
    sprintf(s,"Level:%4d",level);
    settextstyle(18, 0, _T("黑体"));
    settextcolor(RGB(255,255,0));
    outtextxy(950, 80, s);
}

// Draw the snake: red head, olive body
void drawsnake(){
    setlinecolor(RGB(255,255,255));
    setlinestyle(PS_SOLID, 1);
    for(int i=0;i<snakeindex;i++){
        if(i==0){
            setfillcolor(RGB(255,0,0));
        }else{
            setfillcolor(RGB(128,128,0));
        }
        fillrectangle(snake[i].x-5,snake[i].y-5,snake[i].x+5,snake[i].y+5);
    }
}

// Erase the tail segment by painting it in the background colour
void earsesnaketail(){
    setlinecolor(RGB(0,128,0));
    setfillcolor(RGB(0,128,0));
    fillrectangle(snake[snakeindex-1].x-5,snake[snakeindex-1].y-5,snake[snakeindex-1].x+5,snake[snakeindex-1].y+5);
}

// Draw one food item
void drawfood(Point p1){
    setlinestyle(PS_SOLID, 1);
    setfillcolor(RGB(255,255,0));
    fillrectangle(p1.x-5,p1.y-5,p1.x+5,p1.y+5);
}

// Generate a food item; index -1 appends a new one, otherwise food[index] is replaced
void genfood(int index=-1){
    // Random index in 0..apsindex-1 (0..3870), so it never runs past the end of aps
    int r=rand()%apsindex;
    // Re-roll until the point aps[r] does not overlap the snake body
    for(int j=0;j<snakeindex;j++){
        if(aps[r].x==snake[j].x&&aps[r].y==snake[j].y){
            r=rand()%apsindex;
            j=-1; // restart the check from the head for the new point
        }
    }
    if(index!=-1){
        food[index]=aps[r];
    }else{
        food[foodindex]=aps[r];
        foodindex++;
    }
}

// Set up the game
void gamebegin(){
    int i,j;
    // Draw the playing field
    setfillcolor(RGB(0,128,0));
    setlinestyle(PS_SOLID, 10);
    fillrectangle(100,50,900,550);
    // Draw the score and the level
    drawscore(gamescore);
    drawlevel(gamelevel);
    // Fill in the coordinates of all grid points
    for(i=110;i<=890;i+=10){
        for(j=60;j<=540;j+=10){
            aps[apsindex].x=i;
            aps[apsindex].y=j;
            apsindex++;
        }
    }
    // Initialise the snake body, extending to the left of the head
    for(i=0;i<snakelength;i++){
        snake[i].x=snakehead.x-10*i;
        snake[i].y=snakehead.y;
        snakeindex++;
    }
    // Seed the random number generator
    srand((unsigned)time(NULL));
    // Draw the snake
    drawsnake();
    // Generate 10 random food items and draw them
    for(i=0;i<10;i++){
        genfood();
    }
    for(i=0;i<foodindex;i++){
        drawfood(food[i]);
    }
}

// Move the head to temp and update the game state
void toward(struct Point temp){
    int i;
    // Hit the wall?
    if(temp.x<110||temp.x>890||temp.y<60||temp.y>540){
        gamestop=1;
        stoptype=1;
    }
    // Hit its own body?
    for(i=0;i<snakeindex;i++){
        if(temp.x==snake[i].x&&temp.y==snake[i].y){
            gamestop=1;
            stoptype=2;
        }
    }
    // iseat records whether the snake ate a food item on this move
    bool iseat=false;
    for(i=0;i<foodindex;i++){
        if(food[i].x==temp.x&&food[i].y==temp.y){
            iseat=true;
            // Update the score and the level
            gamescore=gamescore+(snakeindex-snakelength+1)*2;
            if(gamescore>400){
                gamelevel=6;
            }else if(gamescore>300){
                gamelevel=5;
            }else if(gamescore>200){
                gamelevel=4;
            }else if(gamescore>100){
                gamelevel=3;
            }else if(gamescore>10){
                gamelevel=2;
            }
            // Replace the eaten food item
            genfood(i);
            drawfood(food[i]);
        }
    }
    Sleep(10);
    // Without food the tail moves on; with food the snake grows by one segment
    if(!iseat){
        earsesnaketail();
    }
    for(i=snakeindex;i>0;i--){
        snake[i]=snake[i-1];
    }
    if(iseat){
        snakeindex++;
    }
    snake[0]=temp;
    drawsnake();
}

// Move up
void towardsUp(){
    struct Point newhead={snake[0].x,snake[0].y-10};
    toward(newhead);
}

// Move down
void towardsDown(){
    struct Point newhead={snake[0].x,snake[0].y+10};
    toward(newhead);
}

// Move left
void towardsLeft(){
    struct Point newhead={snake[0].x-10,snake[0].y};
    toward(newhead);
}

// Move right
void towardsRight(){
    struct Point newhead={snake[0].x+10,snake[0].y};
    toward(newhead);
}

void gameplay(){
    // Read the arrow keys, draw a block at the new head position and remove the tail block.
    // The game ends when the head touches the wall; otherwise the snake keeps moving
    // in the direction of the last key pressed. A 180-degree turn is ignored.
    int dr=0;
    while(gamestop!=1){
        if(KEYDOWN(VK_UP)&&dr!=2){
            dr=1;
        }
        if(KEYDOWN(VK_DOWN)&&dr!=1){
            dr=2;
        }
        if(KEYDOWN(VK_LEFT)&&dr!=4){
            dr=3;
        }
        if(KEYDOWN(VK_RIGHT)&&dr!=3){
            dr=4;
        }
        switch(dr){
            case 1: towardsUp(); break;
            case 2: towardsDown(); break;
            case 3: towardsLeft(); break;
            case 4: towardsRight(); break;
        }
        // Redraw the score and the level
        drawscore(gamescore);
        drawlevel(gamelevel);
        // Higher levels wait less between moves
        switch(gamelevel){
            case 1: Sleep(80); break;
            case 2: Sleep(70); break;
            case 3: Sleep(60); break;
            case 4: Sleep(50); break;
            case 5: Sleep(40); break;
            case 6: Sleep(20); break;
        }
    }
}

void gameend(){
    // Draw the game-over box
    setfillcolor(RGB(255,0,0));
    setlinestyle(PS_SOLID, 10);
    fillrectangle(300,200,700,300);
    char s[200];
    if(stoptype==1){
        sprintf(s,"You hit the wall. Game over!");
    }else if(stoptype==2){
        sprintf(s,"You bit yourself. Game over!");
    }else{
        sprintf(s,"Game over!");
    }
    settextstyle(18, 0, _T("黑体"));
    settextcolor(RGB(255,255,0));
    RECT r = {300,200,700,300};
    setbkmode(TRANSPARENT);
    drawtext(s, &r, DT_CENTER | DT_VCENTER | DT_SINGLELINE);
    // Keep the window open; the two calls after this loop are never reached
    while(1){
        Sleep(10000);
    };
    getch();
    closegraph();
}
```

![](https://image.ynotes.cn/18-10-30/5567999.jpg)

`Download link` [Snake download](https://image.ynotes.cn/%E8%B4%AA%E5%90%83%E8%9B%87.exe)
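
The movement and collision rules do not depend on EasyX, so they can be checked separately from the drawing code. The following is a minimal sketch in Python, assuming the same 79 x 49 grid but with cell indices instead of pixel coordinates; `place_food` and `step` are hypothetical names, not functions from the game above.

```python
import random
from collections import deque

COLS, ROWS = 79, 49  # 79 * 49 = 3871 cells, the size of the aps array


def place_food(snake, foods, rng=random):
    """Pick a free cell for a new food item, or None if the board is full."""
    occupied = set(snake) | set(foods)
    free = [(x, y) for x in range(COLS) for y in range(ROWS)
            if (x, y) not in occupied]
    return rng.choice(free) if free else None


def step(snake, direction, foods):
    """Advance the snake one cell. Returns 'wall', 'self', 'eat' or 'move'."""
    dx, dy = direction
    head = (snake[0][0] + dx, snake[0][1] + dy)
    if not (0 <= head[0] < COLS and 0 <= head[1] < ROWS):
        return "wall"
    if head in snake:
        return "self"
    snake.appendleft(head)      # new head goes to the front
    if head in foods:
        foods.remove(head)      # eating: the tail stays, so the snake grows
        return "eat"
    snake.pop()                 # plain move: drop the tail
    return "move"


snake = deque([(19, 19), (18, 19), (17, 19)])  # head first, facing right
foods = {(20, 19)}
print(step(snake, (1, 0), foods), len(snake))
print(step(snake, (1, 0), foods), len(snake))
print(step(snake, (-1, 0), foods))             # reversing runs into the body
```

Choosing the food position from the list of free cells avoids the re-roll loop entirely, at the cost of building that list on every call.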

   2018-10-18 21:20:29    2018-10-18 21:20:29   

image decoding base64
### A .jpg file was received, but Windows reports it as corrupted when opening it

#### Check the file type

```bash
$ file test.jpg
```

```
test.jpg: ASCII text, with very long lines, with no line terminators
```

#### The file is ASCII text, so look at its content

```bash
$ cat test.jpg
```

```
/43/DgABpGSUZAAQEBAABgAGAAD/vQA0AHBQUGBQQHBgUGCAcHCAoBGwoJCQoFHwAcARgVGhkYFRgXGx4XISsdFS0XGBIuIiUoKSssKyoQLyM...
```

#### This looks like base64 encoding, so decode it with base64

```bash
$ cat test.jpg|base64 -d >test2.jpg
```

#### Check the file type again

```bash
$ file test2.jpg
```

```
test2.jpg: data
```

After base64 decoding the file is still not in JPEG format.

#### Look at the hex dump of the file

```bash
$ cat test2.jpg|xxd |more
```

```bash
0000000: ff8d ff0e 0001 a464 9464 0010 1010 0006 .......d.d......
0000010: 0006 0000 ffbd 0034 0070 5050 6050 4070 .......4.pPP`P@p
0000020: 6050 6080 7070 80a0 11b0 a090 90a0 51f0 `P`.pp........Q.
0000030: 01c0 1181 51a1 9181 5181 71b1 e172 12b1 ....Q...Q.q..r..
0000040: d152 d171 8122 e222 5282 92b2 c2b2 a102 .R.q."."R.......
0000050: f233 f2a2 2372 a2b2 a2ff bd00 3410 7080 .3..#r......4.p.
0000060: 80a0 90a0 41b0 b041 a2c1 81c1 a2a2 a2a2 ....A..A........
0000070: a2a2 a2a2 a2a2 a2a2 a2a2 a2a2 a2a2 a2a2 ................
0000080: a2a2 a2a2 a2a2 a2a2 a2a2 a2a2 a2a2 a2a2 ................
0000090: a2a2 a2a2 a2a2 a2a2 a2a2 a2a2 a2a2 ff0c ................
00000a0: 0011 8010 0e20 0830 1022 0020 1110 3011 ..... .0.". ..0.
00000b0: 10ff 4c00 f100 0010 5010 1010 1010 1000 ..L.....P.......
00000c0: 0000 0000 0000 0010 2030 4050 6070 8090 ........ 0@P`p..
00000d0: a0b0 ff4c 005b 0100 2010 3030 2040 3050 ...L.[.. .00 @0P
00000e0: 5040 4000 0010 d710 2030 0040 1150 2112 P@@..... 0.@.P!.
00000f0: 1314 6031 1516 7022 1741 2318 191a 8032 ..`1..p".A#....2
```

#### Googling the string ff8dff0e gives

![](https://image.ynotes.cn/ff8dff0e.png)

The results show that in every byte of the file the high 4 bits and the low 4 bits have been swapped. This fits the dump: a JFIF JPEG starts with `ff d8 ff e0`, and swapping the two halves of each byte gives `ff 8d ff 0e`.

#### Write a Python script to do the conversion

```bash
$ vim conv_jpg.py
```

```python
from __future__ import print_function
import base64
import sys

# Usage: python conv_jpg.py <source file> <destination file>
src_file = sys.argv[1]
dest_file = sys.argv[2]

# Read the base64 text and decode it to raw bytes
with open(src_file, 'rb') as f:
    data = bytearray(base64.b64decode(f.read()))

# Swap the high and low 4 bits of every byte
# (iterating a bytearray yields integers on both Python 2 and Python 3)
for i, c in enumerate(data):
    c1 = (c & 0xf0) >> 4
    c2 = (c & 0x0f) << 4
    data[i] = c1 | c2

with open(dest_file, 'wb') as f2:
    f2.write(data)
```

```bash
$ python conv_jpg.py test.jpg test_new.jpg
```

#### Check the file again: the format is now reported as JPEG, so the decoding succeeded!

```bash
$ file test_new.jpg
```

```
test_new.jpg: JPEG image data, JFIF standard 1.01
```
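
The same swap can be expressed as a table lookup. The following is a minimal Python 3 sketch using `bytes.translate`; `SWAP_TABLE` and `decode_swapped` are hypothetical names, and the input is just the first eight base64 characters of the file shown above.

```python
import base64

# 256-entry table: entry i holds i with its high and low 4 bits exchanged
SWAP_TABLE = bytes(((i & 0x0F) << 4) | (i >> 4) for i in range(256))


def decode_swapped(b64_text):
    """Base64-decode the input, then swap the two halves of every byte."""
    raw = base64.b64decode(b64_text)
    return raw.translate(SWAP_TABLE)


print(decode_swapped(b"/43/DgAB").hex())  # ffd8ffe00010, the start of a JFIF header
```

Because the swap is its own inverse, applying `translate(SWAP_TABLE)` twice returns the original bytes, so the same table also works for re-encoding.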

   2018-09-20 17:44:04    2019-11-14 14:31:45   

hdfs hadoop

   2018-09-20 17:12:38    2019-11-14 14:31:53   

hdfs hadoop
### Environment

System: `CentOS7`

Software:

- `hadoop`:`2.7.7`

Servers:

- `Hadoop Master`: `172.16.0.3(master)` `NameNode` `SecondaryNameNode` `ResourceManager` `DataNode` `NodeManager`
- `Hadoop Slave`: `172.16.0.4(slave1)` `DataNode` `NodeManager`
- `Hadoop Slave`: `172.16.0.5(slave2)` `DataNode` `NodeManager`
- `Hadoop Slave`: `172.16.0.6(slave3)` `DataNode` `NodeManager`
- `Hadoop Slave`: `172.16.0.7(slave4)` `DataNode` `NodeManager`

### Initial setup

#### Configure host name resolution

`all hosts`

```bash
cat >> /etc/hosts << EOF
172.16.0.3 master
172.16.0.4 slave1
172.16.0.5 slave2
172.16.0.6 slave3
172.16.0.7 slave4
EOF
```

#### Create a key pair and set up passwordless login to the slaves

`master` (this step runs as the `hadoop` user, so do it after the user has been created in the "Create the user" step below)

```bash
su - hadoop
ssh-keygen -t rsa
ssh-copy-id slave1
ssh-copy-id slave2
ssh-copy-id slave3
ssh-copy-id slave4
```

#### Download and install Java

Download page: https://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html

`all hosts`

```bash
rpm -ivh jdk-8u221-linux-x64.rpm
```

### Install the Hadoop cluster

#### Create the user

`all hosts`

```bash
useradd -d /opt/hadoop hadoop
echo "password"|passwd --stdin hadoop  # set the user's password non-interactively
```

#### Download Hadoop

`master`

```bash
curl -O http://apache.javapipe.com/hadoop/common/hadoop-2.7.7/hadoop-2.7.7.tar.gz
tar xfz hadoop-2.7.7.tar.gz
cp -rf hadoop-2.7.7/* /opt/hadoop/
chown -R hadoop:hadoop /opt/hadoop/
```

#### Configure environment variables

`master`

```bash
su - hadoop
cat >> .bash_profile << EOF
## JAVA env variables
export JAVA_HOME=/usr/java/default
export PATH=\$PATH:\$JAVA_HOME/bin
export CLASSPATH=.:\$JAVA_HOME/jre/lib:\$JAVA_HOME/lib:\$JAVA_HOME/lib/tools.jar
## HADOOP env variables
export HADOOP_HOME=/opt/hadoop
export HADOOP_COMMON_HOME=\$HADOOP_HOME
export HADOOP_HDFS_HOME=\$HADOOP_HOME
export HADOOP_MAPRED_HOME=\$HADOOP_HOME
export HADOOP_YARN_HOME=\$HADOOP_HOME
export HADOOP_OPTS="-Djava.library.path=\$HADOOP_HOME/lib/native"
export HADOOP_COMMON_LIB_NATIVE_DIR=\$HADOOP_HOME/lib/native
export PATH=\$PATH:\$HADOOP_HOME/sbin:\$HADOOP_HOME/bin
EOF
source .bash_profile
```

### Configure the Hadoop cluster

#### Edit core-site.xml

`master`

```bash
su - hadoop
vi etc/hadoop/core-site.xml
```

```xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:9000/</value>
  </property>
</configuration>
```

#### Edit hdfs-site.xml

`master`

```bash
vi etc/hadoop/hdfs-site.xml
```

```xml
<configuration>
  <property>
    <name>dfs.data.dir</name>
    <value>file:///opt/volume/datanode</value>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>file:///opt/volume/namenode</value>
  </property>
</configuration>
```

#### Edit mapred-site.xml

`master`

```bash
vi etc/hadoop/mapred-site.xml
```

```xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>master:9001</value>
  </property>
</configuration>
```

#### Edit yarn-site.xml

`master`

```bash
vi etc/hadoop/yarn-site.xml
```

```xml
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>master</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>${yarn.resourcemanager.hostname}:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.bind-host</name>
    <value>0.0.0.0</value>
  </property>
</configuration>
```

#### Edit hadoop-env.sh

`master`

```bash
vi etc/hadoop/hadoop-env.sh
```

```bash
export JAVA_HOME=/usr/java/default/
```

#### Edit masters

`master`

```bash
cat > etc/hadoop/masters << EOF
master
EOF
```

#### Edit slaves

`master`

```bash
cat > etc/hadoop/slaves << EOF
master
slave1
slave2
slave3
slave4
EOF
```

#### Copy Hadoop to the slave nodes

```bash
su - hadoop
scp -r /opt/hadoop/* slave1:/opt/hadoop/
scp -r /opt/hadoop/* slave2:/opt/hadoop/
scp -r /opt/hadoop/* slave3:/opt/hadoop/
scp -r /opt/hadoop/* slave4:/opt/hadoop/
```

### Format the NameNode

`master`

```bash
su - hadoop
hdfs namenode -format
```

### Start and stop the cluster

`master`

```bash
start-all.sh  # start the Hadoop cluster
stop-all.sh   # stop the Hadoop cluster
```

### Check the processes

`master`

```bash
jps
```

```
21078 Jps
3922 ResourceManager
4050 NodeManager
3431 NameNode
3577 DataNode
3755 SecondaryNameNode
```

`slave nodes`

```bash
jps
```

```
7517 Jps
21298 DataNode
21422 NodeManager
```

### Test the HDFS cluster

```bash
hdfs dfs -mkdir /my_storage                # create a directory
hdfs dfs -put LICENSE.txt /my_storage      # upload a file
hdfs dfs -cat /my_storage/LICENSE.txt      # print the file
hdfs dfs -ls /my_storage/                  # list the directory
hdfs dfs -get /my_storage/ ./              # download the files
```

### Monitor the cluster services

`master`

```bash
http://master:50070
```

#### Browse the HDFS file system

```bash
http://master:50070/explorer.html
```

#### Cluster and application information

```bash
http://master:8088
```

#### NodeManager information

```bash
http://master:8042
```

### Start at boot

`master`

```bash
vi /etc/rc.local
```

```bash
su - hadoop -c "/opt/hadoop/sbin/start-all.sh"
```

```bash
chmod +x /etc/rc.d/rc.local
systemctl enable rc-local
systemctl start rc-local
```

### Running MapReduce with Python

`Note: this job computes the maximum temperature for each year from 1901 to 1909 in the NOAA data. In each record, character positions 15-18 hold the year, positions 87-91 the temperature, and position 92 the quality code. The mapper processes every line and emits "year temperature" (for example, the raw value +0056 in a 1901 record is emitted as 1901 56); the reducer takes the mapper output and computes the maximum for each year.`

Mapper program

```bash
cat mapper_noaa.py
```

```python
#!/usr/bin/env python
import sys
import re

# Quality codes that mark a usable reading
pattern = re.compile(r'[01459]')
for line in sys.stdin:
    year, temperature, q = line[15:19], int(line[87:92]), line[92:93]
    # 9999 marks a missing temperature
    if pattern.match(q) and temperature != 9999:
        print("{0}\t{1}".format(year, temperature))
```

Reducer program

```bash
cat reducer_noaa.py
```

```python
#!/usr/bin/env python
import sys

current_year = None
current_temp_max = None
# The input arrives sorted by year, so all lines of one year are adjacent
for line in sys.stdin:
    year, temperature = line.strip().split('\t')
    try:
        temperature = int(temperature)
    except ValueError:
        continue
    if current_year == year:
        if current_temp_max < temperature:
            current_temp_max = temperature
    else:
        # A new year starts: print the result for the previous one
        if current_year:
            print("{0} {1}".format(current_year, current_temp_max))
        current_year = year
        current_temp_max = temperature
# Print the result for the last year
if current_year:
    print("{0} {1}".format(current_year, current_temp_max))
```

#### Download the data

```bash
ftp://ftp.ncdc.noaa.gov/pub/data/noaa/  # put the downloaded data for each year into the noaa folder
```

#### Upload the data to HDFS

```bash
su - hadoop
hdfs dfs -mkdir /test/                   # create the test directory
hdfs dfs -copyFromLocal noaa /test/noaa  # noaa is the downloaded weather data
```

#### Run MapReduce

```bash
su - hadoop
hadoop jar /opt/hadoop/share/hadoop/tools/lib/hadoop-streaming-2.7.7.jar -file ./mapper_noaa.py -file ./reducer_noaa.py -mapper ./mapper_noaa.py -reducer ./reducer_noaa.py -input /test/noaa/190[0-9]/ -output /test/noaa_1901_1909_results
```

#### View the result

```bash
hdfs dfs -cat /test/noaa_1901_1909_results/part-00000
```

```
1901 317
1902 244
1903 289
1904 256
1905 283
1906 294
1907 283
1908 289
1909 278
```

`Note: the temperatures are stored multiplied by 10, so the maximum temperature in 1901 is 31.7°`
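
Before submitting the job to the cluster, the mapper and reducer logic can be checked locally. The following is a minimal sketch that imitates the map, sort, and reduce phases in plain Python; `make_record`, `map_line`, and `run_job` are hypothetical helpers, and the records are synthetic lines that only fill the character positions the mapper reads.

```python
from itertools import groupby

GOOD_QUALITY = set("01459")


def make_record(year, temp, quality="1"):
    """Build a synthetic 93-character line with the fields at the mapper's offsets."""
    line = ["0"] * 93
    line[15:19] = "{:04d}".format(year)    # year at positions 15-18
    line[87:92] = "{:+05d}".format(temp)   # signed temperature at positions 87-91
    line[92] = quality                     # quality code at position 92
    return "".join(line)


def map_line(line):
    """Same filter as mapper_noaa.py: emit (year, temperature) for valid readings."""
    year, temp, q = line[15:19], int(line[87:92]), line[92:93]
    if q in GOOD_QUALITY and temp != 9999:
        yield year, temp


def run_job(lines):
    """Map every line, sort by key (stands in for the shuffle), reduce with max."""
    pairs = sorted(kv for line in lines for kv in map_line(line))
    return {year: max(t for _, t in group)
            for year, group in groupby(pairs, key=lambda kv: kv[0])}


records = [
    make_record(1901, 56),
    make_record(1901, 317),
    make_record(1901, 9999),               # missing reading, dropped
    make_record(1902, 244),
    make_record(1902, 300, quality="2"),   # bad quality code, dropped
]
print(run_job(records))
```

The real scripts can be exercised the same way from a shell by piping a sample file through the mapper, `sort`, and the reducer.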

   2018-09-12 11:08:30    2019-11-14 14:23:16   

v2ray crawler proxy port multiplexing sslh
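### Introduction

`The server purchased is a dial-up server in Jiangsu. The provider exposes only one remote port and offers no other port mappings, but our crawler runs on a local machine and has to reach the proxy server over the public network, so port multiplexing is used to solve this.`

### Preparation

`Port multiplexing software`: `sslh`

`Proxy software`: `v2ray`

### Dial up on the server to obtain a public IP

```bash
adsl-start  # dial up; the command differs between providers, and some providers wrap it in their own script
```

### Server initialisation

```bash
yum install epel-release -y  # install epel-release
```

### Connect to the dial-up IP with ssh

`This step tests whether the dial-up IP has port restrictions, and guards against being locked out of remote access if the sslh port multiplexing set up later fails.`

`If it succeeds, go on to the next step; if it fails, find out why first.`

### sslh

#### Install

```bash
yum install sslh -y
```

#### Configure

```bash
vim /etc/sslh.cfg
```

```yaml
# This is a basic configuration file that should provide
# sensible values for "standard" setup.

verbose: false;
foreground: true;
inetd: false;
numeric: false;
transparent: false;
timeout: 2;
user: "sslh";

# Change hostname with your external address name.
listen:
(
    # This is the ssh port mapped by the dial-up provider (not 22), so the
    # multiplexed port must be the same as the original ssh port number
    { host: "0.0.0.0"; port: "33890"; }
);

protocols:
(
    # ssh packets are forwarded to port 22
    { name: "ssh"; service: "ssh"; host: "localhost"; port: "22"; fork: true; },
    # packets of any other protocol are forwarded to 27073 (the v2ray port)
    { name: "anyprot"; host: "localhost"; port: "27073"; }
);
```

#### Change the ssh listening port

```bash
vim /etc/ssh/sshd_config
```

```bash
Port 22  # change the original 33890 to port 22
...
```

#### Restart ssh and start sslh

```bash
systemctl restart sshd&&systemctl start sslh  # restart sshd first so it listens on 22, then start sslh to listen on 33890
systemctl enable sslh                         # start at boot
```

#### Test by reconnecting with ssh to the remote host and port given by the provider

`If the reconnection succeeds, sslh is forwarding the port to ssh correctly`

### V2ray

#### Install

```bash
bash <(curl -L -s https://install.direct/go.sh)
```

#### Configure

```bash
vim /etc/v2ray/config.json
```

```yaml
{
  "inbounds": [{
    "port": 27073,           // change to the port sslh forwards to (see above)
    "listen": "127.0.0.1",   // listening on the loopback address is enough
    "protocol": "vmess",
    "settings": {
      "clients": [
        {
          "id": "62f8c0f5-69fa-41f8-a7b0-97d43014d478",
          "level": 1,
          "alterId": 64
        }
      ]
    }
  }],
  "outbounds": [{
    "protocol": "freedom",
    "settings": {}
  },{
    "protocol": "blackhole",
    "settings": {},
    "tag": "blocked"
  }],
  "routing": {
    "rules": [
      {
        "type": "field",
        "ip": ["geoip:private"],
        "outboundTag": "blocked"
      }
    ]
  }
}
```

#### Start

```bash
systemctl start v2ray   # start
systemctl enable v2ray  # start at boot
```

#### One-step install script

```bash
adsl-start&&bash <(curl -L -s https://files.ynotes.cn/biv2ray.sh)
```

### Test by connecting a V2ray client to the remote host and port given by the provider

`If the connection succeeds, the remote port is being multiplexed correctly: the single remote port given by the provider now serves both the proxy and ssh`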
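
The routing in `sslh.cfg` works because the first bytes of a connection reveal its protocol: an SSH peer opens with an identification string that begins with `SSH-` (RFC 4253). The following is a simplified illustration of that dispatch rule in Python, not sslh's actual implementation; `classify` is a hypothetical name and the backends are the two from the configuration above.

```python
SSH_BACKEND = ("localhost", 22)       # the "ssh" entry in sslh.cfg
OTHER_BACKEND = ("localhost", 27073)  # the "anyprot" entry (v2ray)


def classify(first_bytes):
    """Choose a backend from the first bytes a client sends."""
    # SSH identification strings look like "SSH-2.0-OpenSSH_7.4"
    if first_bytes.startswith(b"SSH-"):
        return SSH_BACKEND
    # Everything else falls through to the catch-all entry
    return OTHER_BACKEND


print(classify(b"SSH-2.0-OpenSSH_7.4\r\n"))
print(classify(b"\x16\x03\x01\x02\x00"))
```

Because the catch-all entry comes last, any traffic that does not look like SSH ends up at the v2ray port, which is why the order of the `protocols` list matters.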
