Unsolved Technical Mysteries over the Years

Limited by my own technical ability, I occasionally run into problems that strike me as downright bizarre and inexplicable. In the end they almost always turn out to be failures of my own understanding: not enough technical depth, incomplete analysis, or even plain low-level misjudgments. Whatever the cause, when asking colleagues and searching for answers still leaves a problem unsolved, all I can do is write it down as a note and wait. As time passes, some of these problems get forgotten, and some get solved at some later point. This post exists to record them, and it will be updated whenever a new problem appears or an old one is resolved.

Useful links

First, a few frequently used links, for easy reference:

es-spark: count fails after reading data from ES

Using the es-hadoop connector, I launched a Spark job that queries ES, filters the results, and then runs a count operator, and the count failed. After the failure I retried many times (more than five) and it worked every time; the problem would not reproduce. This job runs regularly and had never hit this exception before, so my initial suspicion is instability in the ES cluster, but the real cause is unknown.
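For context, here is a minimal sketch of the read-filter-count path, assuming the plain es-hadoop Java API; the ES address, index name, and query below are placeholders rather than the real job's values:

```java
import java.util.Map;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.elasticsearch.spark.rdd.api.java.JavaEsSpark;

public class EsCountSketch {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("es-count-sketch");
        conf.set("es.nodes", "es-host:9200"); // placeholder ES address

        JavaSparkContext jsc = new JavaSparkContext(conf);

        // Lazily defines the ES-backed RDD; nothing is fetched yet.
        JavaPairRDD<String, Map<String, Object>> rdd =
                JavaEsSpark.esRDD(jsc, "some-index/some-type", "?q=*");

        // count() forces partition planning, i.e. the shard-discovery call
        // (RestClient.targetShards) that blew up in the stack trace below.
        long n = rdd.filter(t -> t._2() != null).count();
        System.out.println("count = " + n);

        jsc.stop();
    }
}
```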

Error screenshot:
[Screenshot: error message]

The full error message follows (internal package names redacted):

2019-02-26_15:01:44 [main] ERROR spokesman3.SpokesAndBrand:510: !!!!Spark 出错: org.codehaus.jackson.JsonParseException: Unexpected end-of-input in field name
at [Source: org.apache.commons.httpclient.AutoCloseInputStream@2687cf14; line: 1, column: 17581]
org.elasticsearch.hadoop.rest.EsHadoopParsingException: org.codehaus.jackson.JsonParseException: Unexpected end-of-input in field name
at [Source: org.apache.commons.httpclient.AutoCloseInputStream@2687cf14; line: 1, column: 17581]
at org.elasticsearch.hadoop.rest.RestClient.parseContent (RestClient.java:171)
at org.elasticsearch.hadoop.rest.RestClient.get (RestClient.java:155)
at org.elasticsearch.hadoop.rest.RestClient.targetShards (RestClient.java:357)
at org.elasticsearch.hadoop.rest.RestRepository.doGetReadTargetShards (RestRepository.java:306)
at org.elasticsearch.hadoop.rest.RestRepository.getReadTargetShards (RestRepository.java:297)
at org.elasticsearch.hadoop.rest.RestService.findPartitions (RestService.java:241)
at org.elasticsearch.spark.rdd.AbstractEsRDD.esPartitions$lzycompute (AbstractEsRDD.scala:73)
at org.elasticsearch.spark.rdd.AbstractEsRDD.esPartitions (AbstractEsRDD.scala:72)
at org.elasticsearch.spark.rdd.AbstractEsRDD.getPartitions (AbstractEsRDD.scala:44)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply (RDD.scala:239)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply (RDD.scala:237)
at scala.Option.getOrElse (Option.scala:120)
at org.apache.spark.rdd.RDD.partitions (RDD.scala:237)
at org.apache.spark.SparkContext.runJob (SparkContext.scala:1929)
at org.apache.spark.rdd.RDD.count (RDD.scala:1157)
at org.apache.spark.api.java.JavaRDDLike$class.count (JavaRDDLike.scala:440)
at org.apache.spark.api.java.AbstractJavaRDDLike.count (JavaRDDLike.scala:46)
at com.package.to.class.SpokesAndBrand.getMention (SpokesAndBrand.java:508)
at com.package.to.class.SpokesAndBrand.runCelebrityByBrand (SpokesAndBrand.java:185)
at com.package.to.class.SpokesAndBrand.execute (SpokesAndBrand.java:116)
at com.package.to.class.SpokesmanAnalyzer.execute (SpokesmanAnalyzer.java:162)
at com.package.to.class.SpokesmanAnalyzeCli.execute (SpokesmanAnalyzeCli.java:154)
at com.package.to.class.SpokesmanAnalyzeCli.start (SpokesmanAnalyzeCli.java:75)
at com.package.to.class.util.AdvCli.initRunner (AdvCli.java:191)
at com.package.to.class.job.client.BasicInputOutputSystemWorker.run (BasicInputOutputSystemWorker.java:79)
at com.package.to.class.model.AbstractDataReportWorker.run (AbstractDataReportWorker.java:122)
at com.package.to.class.buffalo.job.AbstractBUTaskWorker.runTask (AbstractBUTaskWorker.java:63)
at com.package.to.class.report.cli.TaskLocalRunnerCli.start (TaskLocalRunnerCli.java:110)
at com.package.to.class.util.AdvCli.initRunner (AdvCli.java:191)
at com.package.to.class.report.cli.TaskLocalRunnerCli.main (TaskLocalRunnerCli.java:43)
Caused by: org.codehaus.jackson.JsonParseException: Unexpected end-of-input in field name
at [Source: org.apache.commons.httpclient.AutoCloseInputStream@2687cf14; line: 1, column: 17581]
at org.codehaus.jackson.JsonParser._constructError (JsonParser.java:1433)
at org.codehaus.jackson.impl.JsonParserMinimalBase._reportError (JsonParserMinimalBase.java:521)
at org.codehaus.jackson.impl.JsonParserMinimalBase._reportInvalidEOF (JsonParserMinimalBase.java:454)
at org.codehaus.jackson.impl.Utf8StreamParser.parseEscapedFieldName (Utf8StreamParser.java:1503)
at org.codehaus.jackson.impl.Utf8StreamParser.slowParseFieldName (Utf8StreamParser.java:1404)
at org.codehaus.jackson.impl.Utf8StreamParser._parseFieldName (Utf8StreamParser.java:1231)
at org.codehaus.jackson.impl.Utf8StreamParser.nextToken (Utf8StreamParser.java:495)
at org.codehaus.jackson.map.deser.std.UntypedObjectDeserializer.mapObject (UntypedObjectDeserializer.java:219)
at org.codehaus.jackson.map.deser.std.UntypedObjectDeserializer.deserialize (UntypedObjectDeserializer.java:47)
at org.codehaus.jackson.map.deser.std.UntypedObjectDeserializer.mapArray (UntypedObjectDeserializer.java:165)
at org.codehaus.jackson.map.deser.std.UntypedObjectDeserializer.deserialize (UntypedObjectDeserializer.java:51)
at org.codehaus.jackson.map.deser.std.UntypedObjectDeserializer.mapArray (UntypedObjectDeserializer.java:165)
at org.codehaus.jackson.map.deser.std.UntypedObjectDeserializer.deserialize (UntypedObjectDeserializer.java:51)
at org.codehaus.jackson.map.deser.std.MapDeserializer._readAndBind (MapDeserializer.java:319)
at org.codehaus.jackson.map.deser.std.MapDeserializer.deserialize (MapDeserializer.java:249)
at org.codehaus.jackson.map.deser.std.MapDeserializer.deserialize (MapDeserializer.java:33)
at org.codehaus.jackson.map.ObjectMapper._readValue (ObjectMapper.java:2704)
at org.codehaus.jackson.map.ObjectMapper.readValue (ObjectMapper.java:1286)
at org.elasticsearch.hadoop.rest.RestClient.parseContent (RestClient.java:166)
... 29 more
2019-02-26_15:01:44 [main] INFO rdd.JavaEsRDD:58: Removing RDD 3086 from persistence list

Hexo: TOC anchors broken in generated static HTML pages

All of my blog posts start life as Markdown files, which Hexo renders into static HTML pages that I publish to GitHub Pages, plus a copy on my own VPS (for the sake of Baidu's crawler).

Recently, though, I noticed that the anchors in some posts don't work: the table of contents can't jump, i.e. clicking a heading in the TOC sidebar (文章目录) does not scroll to that section. I had been aware of this for a long time without paying attention or finding the cause. Only recently did I stumble on the pattern: a space inside the heading text breaks the anchor in the generated HTML page. I spot-checked a few other pages and the pattern seems to hold. Some examples:

https://www.playpi.org/2019022501.html, Hexo 踩坑记录的
https://www.playpi.org/2018121901.html, js 字符串分割方法
https://www.playpi.org/2019020701.html, itchat 0 - 初识

And yet other people's blogs have headings containing spaces whose TOC links jump just fine, which puzzles me. My current guess is that this is a Hexo issue, or that something needs extra configuration; the fix will have to wait. An example of a blog where it works: https://blog.itnote.me/Hexo/hexo-chinese-english-space/

A strange exception from a mail dependency

I added the mail dependency below to a project, with no other change whatsoever, so that it could send notification emails when needed:

<!-- Mail dependency -->
<dependency>
    <groupId>org.apache.commons</groupId>
    <artifactId>commons-email</artifactId>
    <version>1.3.3</version>
</dependency>

Then something magical happened: when actually run, it threw the exception below (remove this dependency and everything is fine again). Judging from the trace, the failure happens inside Maven's assembly/packaging step:

Exception in thread "main" java.lang.StackOverflowError
at sun.nio.cs.UTF_8$Encoder.encodeLoop (UTF_8.java:619)
at java.nio.charset.CharsetEncoder.encode (CharsetEncoder.java:561)
at sun.nio.cs.StreamEncoder.implWrite (StreamEncoder.java:271)
at sun.nio.cs.StreamEncoder.write (StreamEncoder.java:125)
at java.io.OutputStreamWriter.write (OutputStreamWriter.java:207)
at java.io.BufferedWriter.flushBuffer (BufferedWriter.java:129)
at java.io.PrintStream.write (PrintStream.java:526)
at java.io.PrintStream.print (PrintStream.java:669)
at java.io.PrintStream.println (PrintStream.java:806)
at org.slf4j.impl.SimpleLogger.write (SimpleLogger.java:381)
at org.slf4j.impl.SimpleLogger.log (SimpleLogger.java:376)
at org.slf4j.impl.SimpleLogger.info (SimpleLogger.java:538)
at org.apache.maven.cli.logging.Slf4jLogger.info (Slf4jLogger.java:59)
at org.codehaus.plexus.archiver.AbstractArchiver$1.hasNext (AbstractArchiver.java:464)
at org.codehaus.plexus.archiver.AbstractArchiver$1.hasNext (AbstractArchiver.java:467)
at org.codehaus.plexus.archiver.AbstractArchiver$1.hasNext (AbstractArchiver.java:467)
at org.codehaus.plexus.archiver.AbstractArchiver$1.hasNext (AbstractArchiver.java:467)

Searching on this exception message turned up nothing useful, and for a long time I couldn't solve it. Finally I compared the configuration against other projects and found the fix: explicitly set the maven-assembly-plugin version to 2.6. Previously no version was set, so Maven fetched the latest version from the repository by default, and that default version presumably happened to be broken.
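For reference, pinning the plugin in pom.xml looks like this (standard plugin coordinates; whether 2.6 is the right version for any other build is of course project-specific):

```xml
<build>
    <plugins>
        <!-- Pin the version explicitly instead of resolving the latest from the repository -->
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-assembly-plugin</artifactId>
            <version>2.6</version>
        </plugin>
    </plugins>
</build>
```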

Python beginner pitfalls

When I first started with Python, I didn't use a bundle like Anaconda or WinPython to manage third-party libraries for me; I installed Python itself and then pip-installed each library as I needed it. That process can genuinely drive you to despair: transitive dependencies between libraries, incompatible versions, and so on. Here are a few memorable episodes.

The error:

Install packages failed: Installing packages: error occurred
numpy.distutils.system_info.NotFoundError: no lapack/blas resources found

The fix is to manually install numpy+mkl first and then scipy; the wheels can be downloaded from http://www.lfd.uci.edu/~gohlke/pythonlibs . I downloaded two files, numpy-1.11.3+mkl-cp27-cp27m-win32.whl and scipy-0.19.0-cp27-cp27m-win32.whl, and installed them by hand.

At first I downloaded the 64-bit packages, only to find that the Python installed on my Windows machine was 32-bit, so they were not supported (I hadn't chosen a bitness when downloading; I just took the default packages). Note that entering the Python REPL prints the version information, and import pip; print(pip.pep425tags.get_supported()) lists the wheel filename tags that pip will accept.
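A quick way to check both things at once; this is a sketch for the old Python 2 / pip setup described here (pip.pep425tags was removed from later pip releases):

```python
# Check interpreter version and bitness before downloading a wheel.
import struct
import sys

print(sys.version)               # full interpreter version string
print(struct.calcsize("P") * 8)  # pointer size in bits: 32 or 64

# On old pip versions only: list the wheel tags this pip accepts.
import pip
print(pip.pep425tags.get_supported())
```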

Also note that scipy needs various third-party libraries in place before it can be installed; the official note reads: Install numpy+mkl before installing scipy. To verify that a library installed successfully, import it in the shell, e.g. for numpy: from numpy import *.

Installing scipy via pip install scipy==0.16.1 (not recommended) does complete, but it spews many errors when prerequisite packages are missing. Summing up what I found online: when installing scipy after numpy fails with numpy.distutils.system_info.NotFoundError, it usually means some system libraries are missing and need to be installed: libopenblas-dev, liblapack-dev, libatlas-dev, libblas-dev.

A quick tour of common third-party libraries:

  • pandas: data analysis
  • sklearn (scikit-learn): machine learning, assorted algorithms
  • jieba: Chinese word segmentation
  • gensim: NLP toolkit, e.g. training word2vec word-embedding models
  • scipy: scientific algorithms, math toolkit
  • numpy: numerical arrays and computation
  • matplotlib: plotting and visualization

Encodings in Python:
In 2.x, the encoding pipeline is: input -> str -> decode -> unicode -> encode -> str -> output.
In 3.x it is different: str is Unicode natively.

One script had code like print u'xx' + yy, where yy is Chinese text. Run directly, it printed to the shell without error; run in the background with output redirected to a file, it crashed, because Python could not determine the encoding of the output stream.
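A minimal Python 2 sketch of that failure mode and one way around it; the alternative is to export PYTHONIOENCODING=utf-8 before running the script:

```python
# -*- coding: utf-8 -*-
# Python 2: when stdout is a terminal, sys.stdout.encoding comes from the
# locale; when stdout is redirected to a file it is None, and printing a
# unicode string falls back to ASCII and raises UnicodeEncodeError.
import sys

msg = u'xx' + u'中文'
if sys.stdout.encoding is None:
    # Redirected: encode explicitly instead of letting print guess.
    sys.stdout.write(msg.encode('utf-8') + '\n')
else:
    print msg
```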

Spark UI fails to render

I started a Spark job in yarn-client mode and saw this exception in the Driver-side log:

2019-01-16_14:53:31 [qtp192486017-1829 - /static/timeline-view.js] WARN servlet.DefaultServlet:587: EXCEPTION 
java.lang.IllegalArgumentException: MALFORMED
at java.util.zip.ZipCoder.toString (ZipCoder.java:58)
at java.util.zip.ZipFile.getZipEntry (ZipFile.java:583)
at java.util.zip.ZipFile.access$900 (ZipFile.java:60)
at java.util.zip.ZipFile$ZipEntryIterator.next (ZipFile.java:539)
at java.util.zip.ZipFile$ZipEntryIterator.nextElement (ZipFile.java:514)
at java.util.zip.ZipFile$ZipEntryIterator.nextElement (ZipFile.java:495)
at java.util.jar.JarFile$JarEntryIterator.next (JarFile.java:257)
at java.util.jar.JarFile$JarEntryIterator.nextElement (JarFile.java:266)
at java.util.jar.JarFile$JarEntryIterator.nextElement (JarFile.java:247)
at org.spark-project.jetty.util.resource.JarFileResource.exists (JarFileResource.java:189)
at org.spark-project.jetty.servlet.DefaultServlet.getResource (DefaultServlet.java:398)
at org.spark-project.jetty.servlet.DefaultServlet.doGet (DefaultServlet.java:476)
at javax.servlet.http.HttpServlet.service (HttpServlet.java:707)
at javax.servlet.http.HttpServlet.service (HttpServlet.java:820)
at org.spark-project.jetty.servlet.ServletHolder.handle (ServletHolder.java:684)
at org.spark-project.jetty.servlet.ServletHandler$CachedChain.doFilter (ServletHandler.java:1507)
at org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.doFilter (AmIpFilter.java:164)
at org.spark-project.jetty.servlet.ServletHandler$CachedChain.doFilter (ServletHandler.java:1478)
at org.spark-project.jetty.servlet.ServletHandler.doHandle (ServletHandler.java:499)
at org.spark-project.jetty.server.handler.ContextHandler.doHandle (ContextHandler.java:1086)
at org.spark-project.jetty.servlet.ServletHandler.doScope (ServletHandler.java:427)
at org.spark-project.jetty.server.handler.ContextHandler.doScope (ContextHandler.java:1020)
at org.spark-project.jetty.server.handler.ScopedHandler.handle (ScopedHandler.java:135)
at org.spark-project.jetty.server.handler.GzipHandler.handle (GzipHandler.java:264)
at org.spark-project.jetty.server.handler.ContextHandlerCollection.handle (ContextHandlerCollection.java:255)
at org.spark-project.jetty.server.handler.HandlerWrapper.handle (HandlerWrapper.java:116)
at org.spark-project.jetty.server.Server.handle (Server.java:366)
at org.spark-project.jetty.server.AbstractHttpConnection.handleRequest (AbstractHttpConnection.java:494)
at org.spark-project.jetty.server.AbstractHttpConnection.headerComplete (AbstractHttpConnection.java:973)
at org.spark-project.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete (AbstractHttpConnection.java:1035)
at org.spark-project.jetty.http.HttpParser.parseNext (HttpParser.java:641)
at org.spark-project.jetty.http.HttpParser.parseAvailable (HttpParser.java:231)
at org.spark-project.jetty.server.AsyncHttpConnection.handle (AsyncHttpConnection.java:82)
at org.spark-project.jetty.io.nio.SelectChannelEndPoint.handle (SelectChannelEndPoint.java:696)
at org.spark-project.jetty.io.nio.SelectChannelEndPoint$1.run (SelectChannelEndPoint.java:53)
at org.spark-project.jetty.util.thread.QueuedThreadPool.runJob (QueuedThreadPool.java:608)
at org.spark-project.jetty.util.thread.QueuedThreadPool$3.run (QueuedThreadPool.java:543)
at java.lang.Thread.run (Thread.java:748)
2019-01-16_14:53:31 [qtp192486017-1829 - /static/timeline-view.js] WARN servlet.ServletHandler:592: Error for /static/timeline-view.js
java.lang.NoClassDefFoundError: org/spark-project/jetty/server/handler/ErrorHandler$ErrorPageMapper
at org.spark-project.jetty.server.handler.ErrorHandler.handle (ErrorHandler.java:71)
at org.spark-project.jetty.server.Response.sendError (Response.java:349)
at javax.servlet.http.HttpServletResponseWrapper.sendError (HttpServletResponseWrapper.java:118)
at org.spark-project.jetty.http.gzip.CompressedResponseWrapper.sendError (CompressedResponseWrapper.java:291)
at org.spark-project.jetty.servlet.DefaultServlet.doGet (DefaultServlet.java:589)
at javax.servlet.http.HttpServlet.service (HttpServlet.java:707)
at javax.servlet.http.HttpServlet.service (HttpServlet.java:820)
at org.spark-project.jetty.servlet.ServletHolder.handle (ServletHolder.java:684)
at org.spark-project.jetty.servlet.ServletHandler$CachedChain.doFilter (ServletHandler.java:1507)
at org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.doFilter (AmIpFilter.java:164)
at org.spark-project.jetty.servlet.ServletHandler$CachedChain.doFilter (ServletHandler.java:1478)
at org.spark-project.jetty.servlet.ServletHandler.doHandle (ServletHandler.java:499)
at org.spark-project.jetty.server.handler.ContextHandler.doHandle (ContextHandler.java:1086)
at org.spark-project.jetty.servlet.ServletHandler.doScope (ServletHandler.java:427)
at org.spark-project.jetty.server.handler.ContextHandler.doScope (ContextHandler.java:1020)
at org.spark-project.jetty.server.handler.ScopedHandler.handle (ScopedHandler.java:135)
at org.spark-project.jetty.server.handler.GzipHandler.handle (GzipHandler.java:264)
at org.spark-project.jetty.server.handler.ContextHandlerCollection.handle (ContextHandlerCollection.java:255)
at org.spark-project.jetty.server.handler.HandlerWrapper.handle (HandlerWrapper.java:116)
at org.spark-project.jetty.server.Server.handle (Server.java:366)
at org.spark-project.jetty.server.AbstractHttpConnection.handleRequest (AbstractHttpConnection.java:494)
at org.spark-project.jetty.server.AbstractHttpConnection.headerComplete (AbstractHttpConnection.java:973)
at org.spark-project.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete (AbstractHttpConnection.java:1035)
at org.spark-project.jetty.http.HttpParser.parseNext (HttpParser.java:641)
at org.spark-project.jetty.http.HttpParser.parseAvailable (HttpParser.java:231)
at org.spark-project.jetty.server.AsyncHttpConnection.handle (AsyncHttpConnection.java:82)
at org.spark-project.jetty.io.nio.SelectChannelEndPoint.handle (SelectChannelEndPoint.java:696)
at org.spark-project.jetty.io.nio.SelectChannelEndPoint$1.run (SelectChannelEndPoint.java:53)
at org.spark-project.jetty.util.thread.QueuedThreadPool.runJob (QueuedThreadPool.java:608)
at org.spark-project.jetty.util.thread.QueuedThreadPool$3.run (QueuedThreadPool.java:543)
at java.lang.Thread.run (Thread.java:748)
2019-01-16_14:53:31 [qtp192486017-1829 - /static/timeline-view.js] WARN server.AbstractHttpConnection:552: /static/timeline-view.js
java.lang.NoSuchMethodError: javax.servlet.http.HttpServletRequest.isAsyncStarted () Z
at org.spark-project.jetty.servlet.ServletHandler.doHandle (ServletHandler.java:608)
at org.spark-project.jetty.server.handler.ContextHandler.doHandle (ContextHandler.java:1086)
at org.spark-project.jetty.servlet.ServletHandler.doScope (ServletHandler.java:427)
at org.spark-project.jetty.server.handler.ContextHandler.doScope (ContextHandler.java:1020)
at org.spark-project.jetty.server.handler.ScopedHandler.handle (ScopedHandler.java:135)
at org.spark-project.jetty.server.handler.GzipHandler.handle (GzipHandler.java:264)
at org.spark-project.jetty.server.handler.ContextHandlerCollection.handle (ContextHandlerCollection.java:255)
at org.spark-project.jetty.server.handler.HandlerWrapper.handle (HandlerWrapper.java:116)
at org.spark-project.jetty.server.Server.handle (Server.java:366)
at org.spark-project.jetty.server.AbstractHttpConnection.handleRequest (AbstractHttpConnection.java:494)
at org.spark-project.jetty.server.AbstractHttpConnection.headerComplete (AbstractHttpConnection.java:973)
at org.spark-project.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete (AbstractHttpConnection.java:1035)
at org.spark-project.jetty.http.HttpParser.parseNext (HttpParser.java:641)
at org.spark-project.jetty.http.HttpParser.parseAvailable (HttpParser.java:231)
at org.spark-project.jetty.server.AsyncHttpConnection.handle (AsyncHttpConnection.java:82)
at org.spark-project.jetty.io.nio.SelectChannelEndPoint.handle (SelectChannelEndPoint.java:696)
at org.spark-project.jetty.io.nio.SelectChannelEndPoint$1.run (SelectChannelEndPoint.java:53)
at org.spark-project.jetty.util.thread.QueuedThreadPool.runJob (QueuedThreadPool.java:608)
at org.spark-project.jetty.util.thread.QueuedThreadPool$3.run (QueuedThreadPool.java:543)
at java.lang.Thread.run (Thread.java:748)

This log was printed over and over; the error was present for the entire run. What did it actually break? I checked, and it had no effect on the Spark job's real work; the job finished and its output was correct. However, while the job was running, the Spark UI page did not render properly (note that the exception begins with a problem serving a static js file):
[Screenshot: Spark UI rendering incorrectly]

Clicking through shows a bare Error 500:
[Screenshot: Error 500 page]

Driver-side log screenshots:
[Screenshot: driver log 1]
[Screenshot: driver log 2]
[Screenshot: driver log 3]

A few days later I hit the same problem again; apart from those two occasions, it has never recurred:

2019-01-24_22:51:49 [qtp697001207-1591 - /static/spark-dag-viz.js] WARN servlet.DefaultServlet:587: EXCEPTION 
java.lang.IllegalArgumentException: MALFORMED
at java.util.zip.ZipCoder.toString (ZipCoder.java:58)
at java.util.zip.ZipFile.getZipEntry (ZipFile.java:583)
at java.util.zip.ZipFile.access$900 (ZipFile.java:60)
at java.util.zip.ZipFile$ZipEntryIterator.next (ZipFile.java:539)
at java.util.zip.ZipFile$ZipEntryIterator.nextElement (ZipFile.java:514)
at java.util.zip.ZipFile$ZipEntryIterator.nextElement (ZipFile.java:495)
at java.util.jar.JarFile$JarEntryIterator.next (JarFile.java:257)
at java.util.jar.JarFile$JarEntryIterator.nextElement (JarFile.java:266)
at java.util.jar.JarFile$JarEntryIterator.nextElement (JarFile.java:247)
at org.spark-project.jetty.util.resource.JarFileResource.exists (JarFileResource.java:189)
at org.spark-project.jetty.servlet.DefaultServlet.getResource (DefaultServlet.java:398)
at org.spark-project.jetty.servlet.DefaultServlet.doGet (DefaultServlet.java:476)
at javax.servlet.http.HttpServlet.service (HttpServlet.java:707)
at javax.servlet.http.HttpServlet.service (HttpServlet.java:820)
at org.spark-project.jetty.servlet.ServletHolder.handle (ServletHolder.java:684)
at org.spark-project.jetty.servlet.ServletHandler$CachedChain.doFilter (ServletHandler.java:1507)
at org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.doFilter (AmIpFilter.java:164)
at org.spark-project.jetty.servlet.ServletHandler$CachedChain.doFilter (ServletHandler.java:1478)
at org.spark-project.jetty.servlet.ServletHandler.doHandle (ServletHandler.java:499)
at org.spark-project.jetty.server.handler.ContextHandler.doHandle (ContextHandler.java:1086)
at org.spark-project.jetty.servlet.ServletHandler.doScope (ServletHandler.java:427)
at org.spark-project.jetty.server.handler.ContextHandler.doScope (ContextHandler.java:1020)
at org.spark-project.jetty.server.handler.ScopedHandler.handle (ScopedHandler.java:135)
at org.spark-project.jetty.server.handler.GzipHandler.handle (GzipHandler.java:264)
at org.spark-project.jetty.server.handler.ContextHandlerCollection.handle (ContextHandlerCollection.java:255)
at org.spark-project.jetty.server.handler.HandlerWrapper.handle (HandlerWrapper.java:116)
at org.spark-project.jetty.server.Server.handle (Server.java:366)
at org.spark-project.jetty.server.AbstractHttpConnection.handleRequest (AbstractHttpConnection.java:494)
at org.spark-project.jetty.server.AbstractHttpConnection.headerComplete (AbstractHttpConnection.java:973)
at org.spark-project.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete (AbstractHttpConnection.java:1035)
at org.spark-project.jetty.http.HttpParser.parseNext (HttpParser.java:641)
at org.spark-project.jetty.http.HttpParser.parseAvailable (HttpParser.java:231)
at org.spark-project.jetty.server.AsyncHttpConnection.handle (AsyncHttpConnection.java:82)
at org.spark-project.jetty.io.nio.SelectChannelEndPoint.handle (SelectChannelEndPoint.java:696)
at org.spark-project.jetty.io.nio.SelectChannelEndPoint$1.run (SelectChannelEndPoint.java:53)
at org.spark-project.jetty.util.thread.QueuedThreadPool.runJob (QueuedThreadPool.java:608)
at org.spark-project.jetty.util.thread.QueuedThreadPool$3.run (QueuedThreadPool.java:543)
at java.lang.Thread.run (Thread.java:748)
2019-01-24_22:51:49 [qtp697001207-1591 - /static/spark-dag-viz.js] WARN servlet.ServletHandler:592: Error for /static/spark-dag-viz.js
java.lang.NoClassDefFoundError: org/spark-project/jetty/server/handler/ErrorHandler$ErrorPageMapper
at org.spark-project.jetty.server.handler.ErrorHandler.handle (ErrorHandler.java:71)
at org.spark-project.jetty.server.Response.sendError (Response.java:349)
at javax.servlet.http.HttpServletResponseWrapper.sendError (HttpServletResponseWrapper.java:118)
at org.spark-project.jetty.http.gzip.CompressedResponseWrapper.sendError (CompressedResponseWrapper.java:291)
at org.spark-project.jetty.servlet.DefaultServlet.doGet (DefaultServlet.java:589)
at javax.servlet.http.HttpServlet.service (HttpServlet.java:707)
at javax.servlet.http.HttpServlet.service (HttpServlet.java:820)
at org.spark-project.jetty.servlet.ServletHolder.handle (ServletHolder.java:684)
at org.spark-project.jetty.servlet.ServletHandler$CachedChain.doFilter (ServletHandler.java:1507)
at org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.doFilter (AmIpFilter.java:164)
at org.spark-project.jetty.servlet.ServletHandler$CachedChain.doFilter (ServletHandler.java:1478)
at org.spark-project.jetty.servlet.ServletHandler.doHandle (ServletHandler.java:499)
at org.spark-project.jetty.server.handler.ContextHandler.doHandle (ContextHandler.java:1086)
at org.spark-project.jetty.servlet.ServletHandler.doScope (ServletHandler.java:427)
at org.spark-project.jetty.server.handler.ContextHandler.doScope (ContextHandler.java:1020)
at org.spark-project.jetty.server.handler.ScopedHandler.handle (ScopedHandler.java:135)
at org.spark-project.jetty.server.handler.GzipHandler.handle (GzipHandler.java:264)
at org.spark-project.jetty.server.handler.ContextHandlerCollection.handle (ContextHandlerCollection.java:255)
at org.spark-project.jetty.server.handler.HandlerWrapper.handle (HandlerWrapper.java:116)
at org.spark-project.jetty.server.Server.handle (Server.java:366)
at org.spark-project.jetty.server.AbstractHttpConnection.handleRequest (AbstractHttpConnection.java:494)
at org.spark-project.jetty.server.AbstractHttpConnection.headerComplete (AbstractHttpConnection.java:973)
at org.spark-project.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete (AbstractHttpConnection.java:1035)
at org.spark-project.jetty.http.HttpParser.parseNext (HttpParser.java:641)
at org.spark-project.jetty.http.HttpParser.parseAvailable (HttpParser.java:231)
at org.spark-project.jetty.server.AsyncHttpConnection.handle (AsyncHttpConnection.java:82)
at org.spark-project.jetty.io.nio.SelectChannelEndPoint.handle (SelectChannelEndPoint.java:696)
at org.spark-project.jetty.io.nio.SelectChannelEndPoint$1.run (SelectChannelEndPoint.java:53)
at org.spark-project.jetty.util.thread.QueuedThreadPool.runJob (QueuedThreadPool.java:608)
at org.spark-project.jetty.util.thread.QueuedThreadPool$3.run (QueuedThreadPool.java:543)
at java.lang.Thread.run (Thread.java:748)
2019-01-24_22:51:49 [qtp697001207-1591 - /static/spark-dag-viz.js] WARN server.AbstractHttpConnection:552: /static/spark-dag-viz.js
java.lang.NoSuchMethodError: javax.servlet.http.HttpServletRequest.isAsyncStarted () Z
at org.spark-project.jetty.servlet.ServletHandler.doHandle (ServletHandler.java:608)
at org.spark-project.jetty.server.handler.ContextHandler.doHandle (ContextHandler.java:1086)
at org.spark-project.jetty.servlet.ServletHandler.doScope (ServletHandler.java:427)
at org.spark-project.jetty.server.handler.ContextHandler.doScope (ContextHandler.java:1020)
at org.spark-project.jetty.server.handler.ScopedHandler.handle (ScopedHandler.java:135)
at org.spark-project.jetty.server.handler.GzipHandler.handle (GzipHandler.java:264)
at org.spark-project.jetty.server.handler.ContextHandlerCollection.handle (ContextHandlerCollection.java:255)
at org.spark-project.jetty.server.handler.HandlerWrapper.handle (HandlerWrapper.java:116)
at org.spark-project.jetty.server.Server.handle (Server.java:366)
at org.spark-project.jetty.server.AbstractHttpConnection.handleRequest (AbstractHttpConnection.java:494)
at org.spark-project.jetty.server.AbstractHttpConnection.headerComplete (AbstractHttpConnection.java:973)
at org.spark-project.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete (AbstractHttpConnection.java:1035)
at org.spark-project.jetty.http.HttpParser.parseNext (HttpParser.java:641)
at org.spark-project.jetty.http.HttpParser.parseAvailable (HttpParser.java:231)
at org.spark-project.jetty.server.AsyncHttpConnection.handle (AsyncHttpConnection.java:82)
at org.spark-project.jetty.io.nio.SelectChannelEndPoint.handle (SelectChannelEndPoint.java:696)
at org.spark-project.jetty.io.nio.SelectChannelEndPoint$1.run (SelectChannelEndPoint.java:53)
at org.spark-project.jetty.util.thread.QueuedThreadPool.runJob (QueuedThreadPool.java:608)
at org.spark-project.jetty.util.thread.QueuedThreadPool$3.run (QueuedThreadPool.java:543)
at java.lang.Thread.run (Thread.java:748)

One more thing worth noting: Chrome refuses connections to certain ports. I once had a Spark job whose UI ended up on port 4045 (which is on Chrome's blocked-port list, reserved for lockd), so its status page simply would not open in Chrome. The page was being blocked by the browser; Spark itself was fine.
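Since the Spark UI defaults to port 4040 and walks upward past busy ports (which is how it can land on 4045), one workaround, assuming you control the submission, is to pin the UI to an allowed port; the jar name below is illustrative:

```
# Pin the Spark UI to a port Chrome will open (4050 is an arbitrary choice).
spark-submit --master yarn --deploy-mode client \
  --conf spark.ui.port=4050 \
  your-job.jar
```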

Minor Git issues

1. The local branch was behind and also conflicted with the remote; git pull errored out demanding a merge, so there was no way to update straight to the latest version. The commands below simply overwrite the local files and force-update to the remote head; uncommitted local changes are lost:

git fetch --all
git reset --hard origin/master

2. One day in September 2018, pushing code with Git started demanding my username and password every time; even saving the credentials didn't help, every push asked again, which felt very strange. It turned out my Git was simply too old: I was on v2.13.0, and after upgrading to v2.18.0 everything returned to normal. Later I ran across a notice somewhere saying that the TLS protocol had been upgraded, forcing old clients to re-enter credentials, and that upgrading resolves it.
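For reference, both the version check and the on-disk credential store are plain Git commands (the real fix here was the upgrade, not the credential helper):

```
git --version                                 # check the installed Git version
git config --global credential.helper store   # cache credentials in ~/.git-credentials (plain text)
```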

As a side note: HTTPS inserts TLS (Transport Layer Security) between TCP and HTTP, providing three things: content encryption, identity authentication, and data integrity. TLS's predecessor is SSL (Secure Sockets Layer), developed by Netscape and later standardized and renamed by the IETF.
