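Problem 7 (comprehensive):
Difficulties:
1. Infinite debugger: right-click the statement and choose "Never pause here" to skip it.
2. The request parameter x is encrypted, and the headers carry the encrypted m and ts (a timestamp).
3. The returned data has to be decrypted.
Overall, this problem combines most of what the previous problems covered into one comprehensive exercise.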
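Following the call stack shows that the encryption happens here:
ts is the timestamp; analysis shows that m is an MD5 hash, for the following reasons:
Next comes the decryption part.
AES decryption was my first guess; debugging confirmed that it works, so I used it directly.
Code: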
import binascii
import json
import time
from hashlib import md5, sha256

import requests
from Crypto.Cipher import AES
from Crypto.Util.Padding import unpad


def decrypt_datas(text):
    """Decrypt a hex-encoded AES-CBC ciphertext and return the plaintext string."""
    key = b'xxxxxxxxoooooooo'  # AES key
    iv = b'0123456789ABCDEF'   # initialization vector (offset)
    ciphertext = binascii.unhexlify(text)
    cipher = AES.new(key=key, mode=AES.MODE_CBC, iv=iv)  # select the cipher mode
    text_decrypt = cipher.decrypt(ciphertext)  # decrypt
    text_unpad = unpad(text_decrypt, block_size=AES.block_size).decode()  # strip the padding
    return text_unpad


cookies = {}
sum_data = 0
for i in range(1, 21):
    time.sleep(1)
    time_stamp = str(int(time.time() * 1000))  # millisecond timestamp
    # m = MD5('xialuo' + ts)
    update_text = md5(('xialuo' + time_stamp).encode('utf-8')).hexdigest()
    # x = SHA-256(m + 'xxoo')
    x = sha256((update_text + 'xxoo').encode('utf-8')).hexdigest()
    params = {
        'page': i,
        'x': x,
    }
    headers = {
        'm': update_text,
        'ts': time_stamp,
    }
    response = requests.get('https://www.mashangpa.com/api/problem-detail/7/data/', params=params, cookies=cookies, headers=headers)
    response.raise_for_status()
    response.encoding = 'utf-8'
    # Decrypt the payload once and reuse the result
    current_array = json.loads(decrypt_datas(text=str(response.json()['r'])))['current_array']
    sum_data += sum(current_array)
    print(i)
    print(current_array)
print(sum_data)
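The signing steps inside the loop above can be pulled out into a small helper, which makes them easy to check without touching the network. This is only a minimal sketch: the function name `sign_request` is made up for illustration, while the 'xialuo' prefix and the 'xxoo' suffix are the values used in the script above.

```python
import time
from hashlib import md5, sha256


def sign_request(time_stamp=None):
    """Return (ts, m, x) computed the same way as in the request loop.

    sign_request is a hypothetical helper name; only the formulas
    come from the script above.
    """
    if time_stamp is None:
        time_stamp = str(int(time.time() * 1000))  # millisecond timestamp
    # m = MD5('xialuo' + ts), sent in the headers
    m = md5(('xialuo' + time_stamp).encode('utf-8')).hexdigest()
    # x = SHA-256(m + 'xxoo'), sent as a query parameter
    x = sha256((m + 'xxoo').encode('utf-8')).hexdigest()
    return time_stamp, m, x


ts, m, x = sign_request('1700000000000')
print(len(m), len(x))  # 32 and 64 hex characters
```

Passing a fixed timestamp makes the output reproducible, so it can be compared against the values seen in the browser's network panel for the same ts.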
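Problem 8 (obfuscation):
Difficulties:
1. Infinite debugger: right-click the statement and choose "Never pause here" to skip it.
2. Following the call stack reveals many function names made up of digits and letters (clearly obfuscated code).
3. As before, the m header and the timestamp header (t in the code below) change with every request.
For point 2, deobfuscate the code first. For ob deobfuscation I recommend the Yuanrenxue crawler tools (爬虫工具-爬虫分析工具-猿人学爬虫工具); once the code has been run through it, it can be analyzed directly.
Following the call stack shows that the encryption happens here:
The JS code after cleanup: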
// Add the key's char codes to each 4-character chunk of the input, then hex-encode the result.
function OOOoOo(_0x240504, _0x8eefdc) {
    const _0x3a3671 = _0x240504['split'](''),  // input characters
        _0x1959d4 = _0x8eefdc['split'](''),    // key characters
        _0x582226 = 4;                         // chunk size
    let _0x5ad857 = [];
    for (let _0x2d33d3 = 0; _0x2d33d3 < _0x3a3671['length']; _0x2d33d3 += _0x582226) {
        let _0x38ae5f = _0x3a3671['slice'](_0x2d33d3, _0x2d33d3 + _0x582226);
        for (let _0x31873b = 0; _0x31873b < _0x38ae5f['length']; _0x31873b++) {
            const _0x11057a = _0x38ae5f[_0x31873b]['charCodeAt'](0),
                _0x1a6269 = _0x1959d4[_0x31873b % _0x1959d4['length']]['charCodeAt'](0),
                _0x25c979 = (_0x11057a + _0x1a6269) % 256;
            _0x38ae5f[_0x31873b] = String['fromCharCode'](_0x25c979);
        }
        _0x5ad857 = _0x5ad857['concat'](_0x38ae5f);
    }
    const _0x28d8b9 = _0x5ad857['join'](''),
        _0x36bdd2 = Array['from'](_0x28d8b9)['map'](_0x3c7e7a => _0x3c7e7a['charCodeAt'](0)['toString'](16)['padStart'](2, "0"))['join']('');
    return _0x36bdd2;
}

// Build the m and t header values for the given page.
function encrypt(page) {
    const _0x1575b7 = 'oooooo';
    var _0x1167c7 = new Date()['getTime']();  // millisecond timestamp
    const m = OOOoOo(_0x1575b7 + _0x1167c7 + page, _0x1575b7);
    const t = btoa(_0x1167c7);  // Base64 of the timestamp
    return [m, t];
}
The Python side is essentially unchanged:
import execjs
import requests

with open('解密.js', 'r', encoding='utf-8') as f:
    content = execjs.compile(f.read())
cookies = {
    'xxxxxx': 'xxxxxx'
}
sum_data = 0
for i in range(1, 21):
    m, t = content.call('encrypt', i)
    headers = {
        'm': m,
        'origin': 'https://www.mashangpa.com',
        'referer': 'https://www.mashangpa.com/problem-detail/8/',
        't': t,
    }
    json_data = {
        'page': i,
    }
    response = requests.post('https://www.mashangpa.com/api/problem-detail/8/data/', cookies=cookies, headers=headers, json=json_data)
    sum_data += sum(response.json()['current_array'])
print(sum_data)
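Because OOOoOo only does character arithmetic, it can also be ported to Python, which would remove the need for execjs and a JS runtime. The following is a sketch under assumptions: the names `ooo_encode` and `encrypt_headers` are invented here, the input is assumed to be ASCII (true for the 'oooooo' key, the timestamp digits and the page number), and the port has not been checked against the live site.

```python
import base64
import time


def ooo_encode(text, key):
    """Hypothetical Python port of OOOoOo.

    For each 4-character chunk, add the key's char codes (mod 256),
    then hex-encode every resulting code as two lowercase digits.
    """
    codes = []
    for start in range(0, len(text), 4):
        chunk = text[start:start + 4]
        for j, ch in enumerate(chunk):
            # j restarts at 0 for every chunk, like the inner JS loop
            codes.append((ord(ch) + ord(key[j % len(key)])) % 256)
    return ''.join(f'{code:02x}' for code in codes)


def encrypt_headers(page, ts_ms=None):
    """Return (m, t) for a page, mirroring the JS encrypt(page)."""
    key = 'oooooo'
    if ts_ms is None:
        ts_ms = int(time.time() * 1000)  # like new Date().getTime()
    m = ooo_encode(f'{key}{ts_ms}{page}', key)
    # btoa(ts) in JS is Base64 of the timestamp's decimal string
    t = base64.b64encode(str(ts_ms).encode('ascii')).decode('ascii')
    return m, t


m, t = encrypt_headers(1, ts_ms=1700000000000)
print(m[:12])  # dededededede: 'o' (111) + 'o' (111) = 222 = 0xde
```

Comparing the output of `encrypt_headers` with the result of `content.call('encrypt', i)` for the same timestamp is a quick way to confirm the port before relying on it.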
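Problem 9 (webpack):
Difficulties:
1. Infinite debugger: right-click the statement and choose "Never pause here" to skip it.
2. The POST parameters m and ts (tt in the code below) are encrypted; ts is again the timestamp.
The encryption happens here:
Patch the missing environment pieces one at a time in the console and it works; this one is not hard either.
JS code: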
const CryptJS = require('crypto-js')

function encrypt() {
    // Read the clock once so that m and tt are derived from the same timestamp.
    const ts = new Date().getTime();
    const m = CryptJS['HmacSHA1'](("9527" + ts), "xxxooo")['toString']();
    const tt = btoa(ts);
    return [m, tt]
}
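The same two values can be produced with Python's standard library, which is handy for cross-checking the JS output. This is a sketch, assuming CryptoJS's default hex output for toString() and UTF-8 encoding of the string inputs; the function name `encrypt_py` is made up here, and the result has not been verified against the site.

```python
import base64
import hashlib
import hmac
import time


def encrypt_py(ts_ms=None):
    """Hypothetical Python counterpart of the JS encrypt()."""
    if ts_ms is None:
        ts_ms = int(time.time() * 1000)  # like new Date().getTime()
    message = f'9527{ts_ms}'.encode('utf-8')
    # HMAC-SHA1 with key 'xxxooo', returned as lowercase hex
    m = hmac.new(b'xxxooo', message, hashlib.sha1).hexdigest()
    # Base64 of the timestamp's decimal string, like btoa(ts)
    tt = base64.b64encode(str(ts_ms).encode('ascii')).decode('ascii')
    return m, tt


m, tt = encrypt_py(1700000000000)
print(m, tt)
```

Both values come from a single timestamp here, so m and tt always describe the same instant.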
Python code:
import execjs
import requests

with open('爬虫求和9.js', mode='r', encoding='utf-8') as f:
    content = execjs.compile(f.read())
cookies = {
    'sessionid': 'pxgpk5qdwgq3gsu6nxuw9e65i4geffp2',
    'Hm_lvt_0d2227abf9548feda3b9cb6fddee26c0': '1755703026,1755703156,1755748408,1755957032',
    'HMACCOUNT': 'C7B822A264D2D7D3',
    's': '51b351b351b351b370b0f09050301030707130d051',
    'Hm_lpvt_0d2227abf9548feda3b9cb6fddee26c0': '1756032322',
}
sum_data = 0
for i in range(1, 21):
    m, tt = content.call('encrypt')
    json_data = {
        'page': i,
        'm': m,
        'tt': tt,
    }
    response = requests.post('https://www.mashangpa.com/api/problem-detail/9/data/', cookies=cookies, json=json_data)
    try:
        sum_data += sum(response.json()['current_array'])
    except Exception:
        # The first attempt failed: regenerate m and tt and send the request once more.
        # (The original passed headers=headers here, but headers is never defined in this script.)
        m, tt = content.call('encrypt')
        json_data = {
            'page': i,
            'm': m,
            'tt': tt,
        }
        response = requests.post('https://www.mashangpa.com/api/problem-detail/9/data/', cookies=cookies, json=json_data)
        sum_data += sum(response.json()['current_array'])
    print(response.json())
print(sum_data)
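It is worth explaining why the Python code repeats the request. The timestamp keeps moving while the request is in flight: a request sent in one second may reach the server a second later, by which point the parameters no longer match. The only option is to regenerate them and send the request again until it goes through.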
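The retry idea can be written as a small bounded loop that regenerates the parameters on every attempt. This is only a sketch: `fetch_with_retry`, `make_params` and `send` are hypothetical names, and the fake `flaky_send` below stands in for the real requests.post call so that the example runs offline.

```python
def fetch_with_retry(send, make_params, page, max_attempts=3):
    """Call send(make_params(page)) until the reply has 'current_array'.

    All names here are illustrative; send is expected to return the
    decoded JSON body as a dict.
    """
    last_error = None
    for _ in range(max_attempts):
        params = make_params(page)  # fresh m / tt on every attempt
        try:
            return send(params)['current_array']
        except (KeyError, ValueError) as exc:
            last_error = exc  # rejected or malformed reply: try again
    raise RuntimeError(f'page {page} failed after {max_attempts} attempts') from last_error


calls = {'n': 0}


def make_params(page):
    # Stand-in for content.call('encrypt') plus the page number
    return {'page': page, 'm': 'demo', 'tt': 'demo'}


def flaky_send(params):
    # Simulated server: rejects the first request, accepts the second
    calls['n'] += 1
    if calls['n'] == 1:
        return {'error': 'stale timestamp'}
    return {'current_array': [1, 2, 3]}


print(fetch_with_retry(flaky_send, make_params, page=1))  # [1, 2, 3]
```

Capping the number of attempts keeps a genuinely broken signature from looping forever, which a bare "retry until it works" would do.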
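Summary:
These three problems cover the basic encryption techniques and situations found on ordinary websites. When you run into encryption on a school's official site or a login page, you can either extract the relevant JS directly or patch the environment yourself 😊😊😊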