
【Record】Emulating Baidu Login in Go

GO crifan 10070 views 0 comments

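【Background】

I previously wrote a tutorial analyzing the logic behind emulating a Baidu login:

【Tutorial】Step by step: using a tool (IE9's F12) to analyze the internal logic of emulating login to a website (the Baidu homepage)

I then implemented it in several languages.

Python:

【Tutorial】Emulating website login, Python version (with complete, runnable code in two versions)

C#:

【Tutorial】Emulating website login, C# version (with complete, runnable code in two versions)

Java:

【Tutorial】Emulating Baidu login, Java code version

And now:

I know almost nothing about Go, but I gather that it also has an HTTP library of its own. So the plan is to start from scratch and, step by step, learn the language itself while implementing the emulated Baidu login.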

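【Process】

1. First, learn a bit about Go itself:

【Record】Downloading and installing Go

2. Then get the basics of development sorted out:

【Record】Basic Go development: implementing Hello World and finding a suitable development environment and tools

3. I switched to a different machine, also x64 Win7, downloaded and installed Go again, and tried the ordinary hello world once more.

A few points worth mentioning here:

(1) After the automatic installation, the corresponding path:

D:\tmp\dev_install_root\Go\bin

had already been added to the current PATH;

(2) The go/bin directory contains three tools:

  • go.exe
  • godoc.exe
  • gofmt.exe

4. Continue learning how to write Go code:

【Record】Learning how to write Go code

5. Having figured out how to write Go code, the next step is to consult the official documentation and learn how to write the HTTP-related code.

6. Go's naming conventions are described here:

Effective Go

7. Next, implement basic web page fetching:

【Record】Fetching a web page's HTML with basic HTTP in Go

8. However, everything obtained above is printed to cmd, which is inconvenient for recording and reviewing during later development.

So I wanted to log the content to a file:

【Solved】Writing output to a log file in Go

9. Then a file encoding problem came up:

【Problem】Go code fails to run: # command-line-arguments .\EmulateLoginBaidu.go:86: illegal UTF-8 sequence

10. Then came the error "cannot use body (type []byte) as type string in assignment":

【Solved】Assigning the body returned by http directly to a string in Go fails: cannot use body (type []byte) as type string in assignment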

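11. At this point, the program can already:

fetch the HTML of the Baidu homepage and write it to the log file.

12. Next, figure out how to get the cookies returned over HTTP:

【Record】Handling HTTP cookies in Go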

13. Next, I need to extract the content I want from the returned HTML, so the next topic is:

【Record】Using regular expressions to find a value in Go
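For step 13, a small sketch of extracting a value with a capture group. It uses the same pattern idea as the full listing further down and the sample `login_token` line quoted there; `extractLoginToken` is a hypothetical helper name:

```go
package main

import (
	"fmt"
	"regexp"
)

// loginTokenRe captures the token between the single quotes.
var loginTokenRe = regexp.MustCompile(`bdPass\.api\.params\.login_token='(\w+)';`)

// extractLoginToken returns the captured token and whether it was found.
func extractLoginToken(html string) (string, bool) {
	m := loginTokenRe.FindStringSubmatch(html)
	if m == nil {
		return "", false
	}
	return m[1], true // m[0] is the whole match, m[1] the first group
}

func main() {
	sample := "bdPass.api.params.login_token='278623fc5463aa25b0189ddd34165592';"
	token, ok := extractLoginToken(sample)
	fmt.Println(token, ok)
}
```

`FindStringSubmatch` returns nil when nothing matches, so the nil check has to come before indexing into the result.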

14. Next, understand Go's dictionary-type variable:

【Solved】The dictionary-type variable in Go: map
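A minimal sketch of the map operations the login code relies on: creating a `map[string]string`, testing for a key with the comma-ok form, and iterating. The keys are taken from the post data used later; `sortedKeys` is a hypothetical helper, added because Go does not guarantee map iteration order:

```go
package main

import (
	"fmt"
	"sort"
)

// sortedKeys returns the keys of dict in sorted order, so that output is
// stable even though map iteration order is not.
func sortedKeys(dict map[string]string) []string {
	keys := make([]string, 0, len(dict))
	for k := range dict {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	postDict := map[string]string{}
	postDict["charset"] = "utf-8"
	postDict["tpl"] = "mn"

	// Comma-ok form: ok reports whether the key is present.
	if v, ok := postDict["charset"]; ok {
		fmt.Println("charset =", v)
	}
	if _, ok := postDict["verifycode"]; !ok {
		fmt.Println("verifycode is not set")
	}

	for _, k := range sortedKeys(postDict) {
		fmt.Printf("%s=%s\n", k, postDict[k])
	}
}
```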

15. Then figure out how to get input from the console:

【Solved】Reading a string from console input in Go

16. Then figure out how to send an HTTP POST:

【Record】Implementing HTTP POST with post data in Go
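For step 16, a sketch of how post data can be URL-encoded with `url.Values` and attached to a POST request. The request is only built, not sent, so the example needs no network; `encodePostData` and `newFormPost` are hypothetical helper names and the token value is made up:

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"strings"
)

// encodePostData turns a string map into a URL-encoded form body.
func encodePostData(postDict map[string]string) string {
	values := url.Values{}
	for k, v := range postDict {
		values.Set(k, v)
	}
	return values.Encode() // keys are emitted in sorted order
}

// newFormPost builds (but does not send) a POST request carrying postDict.
func newFormPost(strUrl string, postDict map[string]string) (*http.Request, error) {
	body := strings.NewReader(encodePostData(postDict))
	req, err := http.NewRequest("POST", strUrl, body)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/x-www-form-urlencoded")
	return req, nil
}

func main() {
	postDict := map[string]string{"charset": "utf-8", "token": "abc123"}
	req, err := newFormPost("https://passport.baidu.com/v2/api/?login", postDict)
	if err != nil {
		fmt.Println("build request failed:", err)
		return
	}
	fmt.Println(req.Method, req.URL)
	fmt.Println(req.Header.Get("Content-Type"))
	fmt.Println(encodePostData(postDict))
}
```

Passing the resulting request to `httpClient.Do` sends it; the server only parses the body as form fields when the `Content-Type` header is set as shown.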

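17. With POST working and the post data passed along, the emulated login succeeds and the corresponding cookies are returned.

So the next step is to check each of those cookies.

Note that the Set-Cookie values in the httpResp.Header returned at this moment are:

(formatted)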

BDUSS=G1LNG5uLTNYWkU2bzA2SGxCZHZ2Rm5ocnN-MEhFem5uQkZrdkJFVmplUmpBV1ZTQVFBQUFBJCQAAAAAAAAAAAEAAAB-OUgCYWdhaW5pbnB1dAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAGN0PVJjdD1SM; expires=Wed, 08-Dec-2021 10:26:43 GMT; path=/; domain=baidu.com; httponly
PTOKEN=deleted; expires=Fri, 21-Sep-2012 10:26:42 GMT; path=/; domain=baidu.com; httponly
PTOKEN=0f1e0187b042630a47c4eea8e0e96a2f; expires=Wed, 08-Dec-2021 10:26:43 GMT; path=/; domain=passport.baidu.com; httponly
STOKEN=8d6ce0cbc7f689a8cd647b8beb5872e3; expires=Wed, 08-Dec-2021 10:26:43 GMT; path=/; domain=passport.baidu.com; httponly
SAVEUSERID=deleted; expires=Fri, 21-Sep-2012 10:26:42 GMT; path=/; domain=passport.baidu.com; httponly
USERNAMETYPE=1; expires=Wed, 08-Dec-2021 10:26:43 GMT; path=/; domain=passport.baidu.com; httponly

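As can be seen, for these cookies:

(1) PTOKEN, for:

domain=baidu.com

has been deleted;

while for passport.baidu.com, PTOKEN still exists;

(2) The other cookies:

STOKEN, SAVEUSERID and USERNAMETYPE all have the domain:

passport.baidu.com

rather than, as I had assumed:

baidu.com

(3) BDUSS's domain is indeed baidu.com

Given that, the earlier code: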

gCurCookies = gCurCookieJar.Cookies(httpReq.URL);

looked as if it would return only

BDUSS

or only:

STOKEN, SAVEUSERID, USERNAMETYPE

Fortunately, though, the cookies printed here via:

dbgPrintCurCookies

are all present:

[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:199) cookieNum=7
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:203) ------ Cookie [0]------
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:204) Name        =H_PS_PSSID
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:205) Value   =3359_1455_2976_2981_3090
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:206) Path        =
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:207) Domain  =
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:208) Expires =0001-01-01 00:00:00 +0000 UTC
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:209) RawExpires  =
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:210) MaxAge  =0
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:211) Secure  =false
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:212) HttpOnly    =false
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:213) Raw     =
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:214) Unparsed    =[]
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:203) ------ Cookie [1]------
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:204) Name        =BAIDUID
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:205) Value   =74F5614706B58BFCCCB3923C8ABD3E61:FG=1
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:206) Path        =
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:207) Domain  =
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:208) Expires =0001-01-01 00:00:00 +0000 UTC
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:209) RawExpires  =
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:210) MaxAge  =0
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:211) Secure  =false
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:212) HttpOnly    =false
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:213) Raw     =
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:214) Unparsed    =[]
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:203) ------ Cookie [2]------
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:204) Name        =HOSUPPORT
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:205) Value   =1
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:206) Path        =
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:207) Domain  =
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:208) Expires =0001-01-01 00:00:00 +0000 UTC
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:209) RawExpires  =
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:210) MaxAge  =0
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:211) Secure  =false
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:212) HttpOnly    =false
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:213) Raw     =
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:214) Unparsed    =[]
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:203) ------ Cookie [3]------
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:204) Name        =BDUSS
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:205) Value   =W95bX41ZTlhNkFKQkpQcGd5Y1ZUOENiYzJ2TkpvakJaZVBXSS10WXh1THVCbVZTQVFBQUFBJCQAAAAAAAAAAAEAAAB-OUgCYWdhaW5pbnB1dAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAO55PVLueT1SO
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:206) Path        =
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:207) Domain  =
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:208) Expires =0001-01-01 00:00:00 +0000 UTC
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:209) RawExpires  =
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:210) MaxAge  =0
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:211) Secure  =false
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:212) HttpOnly    =false
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:213) Raw     =
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:214) Unparsed    =[]
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:203) ------ Cookie [4]------
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:204) Name        =PTOKEN
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:205) Value   =2e67f3d7d5c52118bf4d222ab87ac9a4
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:206) Path        =
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:207) Domain  =
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:208) Expires =0001-01-01 00:00:00 +0000 UTC
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:209) RawExpires  =
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:210) MaxAge  =0
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:211) Secure  =false
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:212) HttpOnly    =false
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:213) Raw     =
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:214) Unparsed    =[]
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:203) ------ Cookie [5]------
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:204) Name        =STOKEN
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:205) Value   =63a3b62efbd83a00c095c624ca4dfdfc
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:206) Path        =
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:207) Domain  =
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:208) Expires =0001-01-01 00:00:00 +0000 UTC
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:209) RawExpires  =
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:210) MaxAge  =0
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:211) Secure  =false
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:212) HttpOnly    =false
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:213) Raw     =
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:214) Unparsed    =[]
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:203) ------ Cookie [6]------
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:204) Name        =USERNAMETYPE
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:205) Value   =1
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:206) Path        =
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:207) Domain  =
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:208) Expires =0001-01-01 00:00:00 +0000 UTC
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:209) RawExpires  =
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:210) MaxAge  =0
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:211) Secure  =false
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:212) HttpOnly    =false
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:213) Raw     =
[2013/09/21 18:50:30 ] [INFO] (main.dbgPrintCurCookies:214) Unparsed    =[]

So from here on, a cookie's existence can be checked directly by its name.
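The reason all the cookies show up is most likely standard cookie domain matching: the request URL's host, passport.baidu.com, matches cookies scoped to passport.baidu.com and also cookies scoped to its parent domain baidu.com. The sketch below reproduces this with `net/http/cookiejar` and no network access; `jarCookieNames` and `demoCookies` are hypothetical helpers and the cookie values are made up:

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/cookiejar"
	"net/url"
)

// jarCookieNames stores cookies as if they were set by a response from setURL,
// then returns the names of the cookies the jar would send to reqURL.
func jarCookieNames(setURL, reqURL string, cookies []*http.Cookie) (map[string]bool, error) {
	jar, err := cookiejar.New(nil)
	if err != nil {
		return nil, err
	}
	su, err := url.Parse(setURL)
	if err != nil {
		return nil, err
	}
	ru, err := url.Parse(reqURL)
	if err != nil {
		return nil, err
	}
	jar.SetCookies(su, cookies)

	names := map[string]bool{}
	for _, ck := range jar.Cookies(ru) {
		names[ck.Name] = true
	}
	return names, nil
}

// demoCookies mimics the two domain scopes seen in the Set-Cookie headers.
func demoCookies() []*http.Cookie {
	return []*http.Cookie{
		{Name: "BDUSS", Value: "demo", Path: "/", Domain: "baidu.com"},
		{Name: "STOKEN", Value: "demo", Path: "/", Domain: "passport.baidu.com"},
	}
}

func main() {
	loginURL := "https://passport.baidu.com/v2/api/?login"

	names, err := jarCookieNames(loginURL, loginURL, demoCookies())
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("passport.baidu.com: BDUSS", names["BDUSS"], "STOKEN", names["STOKEN"])

	names, err = jarCookieNames(loginURL, "http://www.baidu.com/", demoCookies())
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("www.baidu.com: BDUSS", names["BDUSS"], "STOKEN", names["STOKEN"])
}
```

For the passport.baidu.com URL both names are reported, while for www.baidu.com only the baidu.com-scoped one is. The cookies returned by `Jar.Cookies` carry only the name and value, which would also explain the empty Path and Domain fields in the log above.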

18. Finally, the emulated Baidu login succeeded.

The code used is:

/*
 * [File]
 * EmulateLoginBaidu.go
 *
 * [Function]
 * 【Record】Emulating Baidu Login in Go
 *
 * [Version]
 * 2013-09-21
 *
 * [Contact]
 */
package main
 
import (
    "fmt"
    //"builtin"
    //"log"
    "os"
    "runtime"
    "path"
    "strings"
    "time"
    //"io"
    "io/ioutil"
    "net/http"
    "net/http/cookiejar"
    "net/url"
    //"sync"
    //"net/url"
    "regexp"
    //"bufio"
    "bytes"
)
 
//import l4g "log4go.googlecode.com/hg"
//import l4g "code.google.com/p/log4go"
import "code.google.com/p/log4go"
 
/***************************************************************************************************
    Global Variables
***************************************************************************************************/
var gCurCookies []*http.Cookie;
var gCurCookieJar *cookiejar.Jar;
var gLogger log4go.Logger;
 
/***************************************************************************************************
    Functions
***************************************************************************************************/
//do init before all others
func initAll(){
    gCurCookies = nil
    //var err error;
    gCurCookieJar,_ = cookiejar.New(nil)
    gLogger = nil
     
    initLogger()
    initCrifanLib()
}
 
//de-init for all
func deinitAll(){
    gCurCookies = nil
    if(nil != gLogger) { //only close the logger when it actually exists
        gLogger.Close();
        //os.Stdout.Sync() //try manually flush, but can not fix log4go's flush bug
         
        gLogger = nil
    }
}
 
//do some init for crifanLib
func initCrifanLib(){
    gLogger.Debug("init for crifanLib")
    gCurCookies = nil
    return
}
 
//init for logger
func initLogger(){
    var filenameOnly string = GetCurFilename()
    var logFilename string =  filenameOnly + ".log";
     
    //gLogger = log4go.NewLogger()
    //gLogger = make(log4go.Logger)
     
    //for console
    //gLogger.AddFilter("stdout", log4go.INFO, log4go.NewConsoleLogWriter())
    gLogger = log4go.NewDefaultLogger(log4go.INFO)
     
    //for log file
    if _, err := os.Stat(logFilename); err == nil {
        //fmt.Printf("found old log file %s, now remove it\n", logFilename)
        os.Remove(logFilename)
    }
    //gLogger.AddFilter("logfile", log4go.FINEST, log4go.NewFileLogWriter(logFilename, true))
    //gLogger.AddFilter("logfile", log4go.FINEST, log4go.NewFileLogWriter(logFilename, false))
    gLogger.AddFilter("log", log4go.FINEST, log4go.NewFileLogWriter(logFilename, false))
    gLogger.Debug("Current time is : %s", time.Now().Format("15:04:05 MST 2006/01/02"))
     
    return
}
 
// GetCurFilename
// Get current file name, without suffix
func GetCurFilename() string {
    _, fulleFilename, _, _ := runtime.Caller(0)
    //fmt.Println(fulleFilename)
    var filenameWithSuffix string
    filenameWithSuffix = path.Base(fulleFilename)
    //fmt.Println("filenameWithSuffix=", filenameWithSuffix)
    var fileSuffix string
    fileSuffix = path.Ext(filenameWithSuffix)
    //fmt.Println("fileSuffix=", fileSuffix)
     
    var filenameOnly string
    filenameOnly = strings.TrimSuffix(filenameWithSuffix, fileSuffix)
    //fmt.Println("filenameOnly=", filenameOnly)
     
    return filenameOnly
}
 
//get url response html
func getUrlRespHtml(strUrl string, postDict map[string]string) string{
    gLogger.Debug("in getUrlRespHtml, strUrl=%s", strUrl)
    gLogger.Debug("postDict=%s", postDict)
     
    var respHtml string = "";
     
    httpClient := &http.Client{
        //Transport:nil,
        //CheckRedirect: nil,
        Jar:gCurCookieJar,
    }
 
    var httpReq *http.Request
    //var newReqErr error
    if nil == postDict {
        gLogger.Debug("is GET")
        //httpReq, newReqErr = http.NewRequest("GET", strUrl, nil)
        httpReq, _ = http.NewRequest("GET", strUrl, nil)
        // ...
        //httpReq.Header.Add("If-None-Match", `W/"wyzzy"`)
    } else {
        //【Record】Implementing HTTP POST with post data in Go
        gLogger.Debug("is POST")
        postValues := url.Values{}
        for postKey, PostValue := range postDict{
            postValues.Set(postKey, PostValue)
        }
        gLogger.Debug("postValues=%s", postValues)
        postDataStr := postValues.Encode()
        gLogger.Debug("postDataStr=%s", postDataStr)
        postDataBytes := []byte(postDataStr)
        gLogger.Debug("postDataBytes=%s", postDataBytes)
        postBytesReader := bytes.NewReader(postDataBytes)
        //httpReq, newReqErr = http.NewRequest("POST", strUrl, postBytesReader)
        httpReq, _ = http.NewRequest("POST", strUrl, postBytesReader)
        //httpReq.Header.Set("Content-Type", "application/x-www-form-urlencoded; param=value")
        httpReq.Header.Add("Content-Type", "application/x-www-form-urlencoded")
    }
     
    httpResp, err := httpClient.Do(httpReq)
    // ...
     
    //httpResp, err := http.Get(strUrl)
    //gLogger.Info("http.Get done")
    if err != nil {
        gLogger.Warn("http get strUrl=%s response error=%s\n", strUrl, err.Error())
        //httpResp is nil when Do fails, so return early instead of dereferencing it
        return respHtml
    }
    gLogger.Debug("httpResp.Header=%s", httpResp.Header)
    gLogger.Debug("httpResp.Status=%s", httpResp.Status)
 
    defer httpResp.Body.Close()
    // gLogger.Info("defer httpResp.Body.Close done")
     
    body, errReadAll := ioutil.ReadAll(httpResp.Body)
    //gLogger.Info("ioutil.ReadAll done")
    if errReadAll != nil {
        gLogger.Warn("get response for strUrl=%s got error=%s\n", strUrl, errReadAll.Error())
    }
    //gLogger.Debug("body=%s\n", body)
 
    //gCurCookies = httpResp.Cookies()
    //gCurCookieJar = httpClient.Jar;
    gCurCookies = gCurCookieJar.Cookies(httpReq.URL);
    //gLogger.Info("httpResp.Cookies done")
     
    //respHtml = "just for test log ok or not"
    respHtml = string(body)
    //gLogger.Info("httpResp body []byte to string done")
 
    return respHtml
}
 
func dbgPrintCurCookies() {
    var cookieNum int = len(gCurCookies);
    gLogger.Debug("cookieNum=%d", cookieNum)
    for i := 0; i < cookieNum; i++ {
        var curCk *http.Cookie = gCurCookies[i];
        //gLogger.Debug("curCk.Raw=%s", curCk.Raw)
        gLogger.Debug("------ Cookie [%d]------", i)
        gLogger.Debug("Name\t\t=%s", curCk.Name)
        gLogger.Debug("Value\t=%s", curCk.Value)
        gLogger.Debug("Path\t\t=%s", curCk.Path)
        gLogger.Debug("Domain\t=%s", curCk.Domain)
        gLogger.Debug("Expires\t=%s", curCk.Expires)
        gLogger.Debug("RawExpires\t=%s", curCk.RawExpires)
        gLogger.Debug("MaxAge\t=%d", curCk.MaxAge)
        gLogger.Debug("Secure\t=%t", curCk.Secure)
        gLogger.Debug("HttpOnly\t=%t", curCk.HttpOnly)
        gLogger.Debug("Raw\t\t=%s", curCk.Raw)
        gLogger.Debug("Unparsed\t=%s", curCk.Unparsed)
    }
}
 
func main() {
    initAll()
 
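    gLogger.Info("============ Program description ============");
    gLogger.Info("Function: this program demonstrates how to emulate a Baidu login in Go");
    gLogger.Info("Notes:");
    gLogger.Info("1. Some Baidu accounts show the following message when logging in:");
    gLogger.Info("The system has detected that your account may have been stolen and is at risk. Please change your password as soon as possible.");
    gLogger.Info("In that case this program cannot emulate the login; change the password as prompted and it will then work.");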
 
    //step1: access baidu url to get cookie BAIDUID
    gLogger.Info("====== Step 1: get the BAIDUID cookie ======")
    var baiduMainUrl string = "http://www.baidu.com/";
    gLogger.Debug("baiduMainUrl=%s", baiduMainUrl)
    respHtml := getUrlRespHtml(baiduMainUrl, nil)
    gLogger.Debug("respHtml=%s", respHtml)
    dbgPrintCurCookies()
     
    //check cookie
    var bGotCookieBaiduid = false;
    //var cookieNameListToCheck []string = ["BAIDUID"]
    //toCheckCookieNameList := [1]string{"BAIDUID"}
    toCheckCookieNameList := []string{"BAIDUID"}
    toCheckCookieNum := len(toCheckCookieNameList)
    gLogger.Debug("toCheckCookieNum=%d", toCheckCookieNum)
    curCookieNum := len(gCurCookies)
    gLogger.Debug("curCookieNum=%d", curCookieNum)
    for i := 0; i < toCheckCookieNum; i++ {
        toCheckCkName := toCheckCookieNameList[i];
        gLogger.Debug("[%d]toCheckCkName=%s", i, toCheckCkName)
        for j := 0; j < curCookieNum; j++{
            curCookie := gCurCookies[j]
            if(strings.EqualFold(toCheckCkName, curCookie.Name)){
                bGotCookieBaiduid = true;
                break;
            }
        }
    }
 
    if bGotCookieBaiduid {
        gLogger.Info("Found cookie BAIDUID");
    }else{
        gLogger.Info("Not found cookie BAIDUID");
    }
     
    //step2: login, pass paras, extract resp cookie
    gLogger.Info("====== Step 2: extract login_token ======");
    bExtractTokenValueOK := false
    strLoginToken := ""
    var getApiRespHtml string;
    if bGotCookieBaiduid{
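        //NOTE: getapiUrl is not declared anywhere in this listing; it has to be defined
        //(presumably as the getapi URL from the earlier analysis) before this compiles
        getApiRespHtml = getUrlRespHtml(getapiUrl, nil);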
        gLogger.Debug("getApiRespHtml=%s", getApiRespHtml);
        dbgPrintCurCookies()
         
        //bdPass.api.params.login_token='278623fc5463aa25b0189ddd34165592';
        //use regex to extract login_token
        //【Record】Using regular expressions to find a value in Go
        loginTokenP, _ := regexp.Compile(`bdPass\.api\.params\.login_token='(?P<loginToken>\w+)';`)
        //loginToken := loginTokenP.FindString(getApiRespHtml);
        //loginToken := loginTokenP.FindSubmatch(getApiRespHtml);
        foundLoginToken := loginTokenP.FindStringSubmatch(getApiRespHtml);
        gLogger.Debug("foundLoginToken=%s", foundLoginToken);
        if nil != foundLoginToken {
            strLoginToken = foundLoginToken[1] //tmp go regexp not support named group, so use index here
            gLogger.Info("found bdPass.api.params.login_token=%s", strLoginToken);
            bExtractTokenValueOK = true;
        } else {
            gLogger.Warn(" not found login_token from html=%s", getApiRespHtml);
        }
    }
 
    //step3: verify returned cookies
    bLoginBaiduOk := false;
    if bGotCookieBaiduid && bExtractTokenValueOK {
        gLogger.Info("====== Step 3: log in to Baidu and verify the returned cookies ======");
        staticPageUrl := "http://www.baidu.com/cache/user/html/jump.html";
         
        postDict := map[string]string{}
        //postDict["ppui_logintime"] = ""
        postDict["charset"] = "utf-8"
        //postDict["codestring"] = ""
        postDict["token"] = strLoginToken
        postDict["isPhone"] = "false"
        postDict["index"] = "0"
        //postDict["u"] = ""
        //postDict["safeflg"] = "0"
        postDict["staticpage"] = staticPageUrl
        postDict["loginType"] = "1"
        postDict["tpl"] = "mn"
        postDict["callback"] = "parent.bdPass.api.login._postCallback"
 
        //【Solved】Reading a string from console input in Go
        strBaiduUsername := ""
        strBaiduPassword := ""
        gLogger.Info("Please input:")
        gLogger.Info("Baidu Username:")
        _, err1 := fmt.Scanln(&strBaiduUsername)
        if nil == err1 {
            gLogger.Debug("strBaiduUsername=%s", strBaiduUsername)
        }
        gLogger.Info("Baidu Password:")
        _, err2 := fmt.Scanln(&strBaiduPassword)
        if nil == err2 {
            gLogger.Debug("strBaiduPassword=%s", strBaiduPassword)
        }
         
        postDict["username"] = strBaiduUsername
        postDict["password"] = strBaiduPassword
        postDict["verifycode"] = ""
        postDict["mem_pass"] = "on"
         
        gLogger.Debug("postDict=%s", postDict)
         
        baiduMainLoginUrl := "https://passport.baidu.com/v2/api/?login";
        loginBaiduRespHtml := getUrlRespHtml(baiduMainLoginUrl, postDict);
        gLogger.Debug("loginBaiduRespHtml=%s", loginBaiduRespHtml)
        dbgPrintCurCookies();
         
        //check resp cookies exist or not
        cookieNameDict := map[string]bool{
            "BDUSS"     : false,
            "PTOKEN"    : false,
            "STOKEN"    : false,
            //"SAVEUSERID": false, //be deleted
        }
         
        for cookieName, _ := range cookieNameDict {
            for _, singleCookie := range gCurCookies {
                //if(strings.EqualFold(cookieName, singleCookie.Name)){
                if cookieName == singleCookie.Name {
                    cookieNameDict[cookieName] = true;
                    gLogger.Debug("Found cookie %s", cookieName)
                }
            }
        }
        gLogger.Debug("After check resp cookie, cookieNameDict=%s", cookieNameDict)
         
        bAllCookiesFound := true
        for _, bIsExist := range cookieNameDict {
            bAllCookiesFound = bAllCookiesFound && bIsExist
        }
        bLoginBaiduOk = bAllCookiesFound
        if (bLoginBaiduOk) {
            gLogger.Info("Successfully emulated login to the Baidu homepage!" );
        } else{
            gLogger.Info("Failed to emulate login to the Baidu homepage!");
            gLogger.Info("The returned HTML source is: " + loginBaiduRespHtml);
        }
    }
     
    deinitAll()
 
    //【workaround】log4go has a bug when writing output in Go: only part of the messages, or even nothing at all, gets written
    time.Sleep(100 * time.Millisecond)
}

The result:

[Screenshot: emulate baidu login ok in the end]

 

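【Summary】

Starting from nothing, and after a great deal of trial and error, I finally implemented the emulated Baidu login in Go.


Further optimization will follow when time permits, at least including:

【Record】After successfully emulating Baidu login in Go, collecting the related functions into my own Go library: crifanLib.go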

When reposting, please credit: 在路上 » 【Record】Emulating Baidu Login in Go
