SpringMVC Source Code Notes (5): Tomcat's URIEncoding, useBodyEncodingForURI, and CharacterEncodingFilter

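This article continues the garbled-character (mojibake) problem from the previous installment. That article only said that setting Tomcat's URIEncoding solves the problem; this one explains what happens behind that setting. A word up front: reading alone is not enough, so run the experiments yourself.

My Tomcat version is 7.0.55, and the Spring version used throughout this series is 4.0.5.

This article uses Tomcat's source code to explain two of its settings, URIEncoding and useBodyEncodingForURI.

A request commonly carries data in two places, each with its own encoding, as in the following form: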

<!DOCTYPE html>
<html>
    <head>
        <meta charset="utf-8" />
        <title></title>
    </head>
    <body>
        <form action="http://127.0.0.1:8080/string?name=中国" method="post">
            <input type="text" name="user" value="张三"/>
            <input type="submit" value="提交"/>
        </form>
    </body>
</html>

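First, the conclusions.
The request above contains Chinese text in two places: the query string (?name=中国) and the request body (user=张三). Tomcat 7 decodes these two places with separate encodings. URIEncoding applies to the query-string parameters, whereas request.setCharacterEncoding("UTF-8") in a filter, or the charset in the request's Content-Type header, applies only to the request body. Do not confuse the two.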
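
To see why the decoding charset matters, here is a minimal standalone sketch using only the JDK (no Tomcat classes). It assumes the browser sent name=中国 percent-encoded as UTF-8, which is what a page declared with charset utf-8 normally does; the class and method names are made up for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class QueryDecodingDemo {

    /** Percent-decodes a query-string value using the given charset. */
    public static String decode(String raw, Charset charset) {
        try {
            return URLDecoder.decode(raw, charset.name());
        } catch (UnsupportedEncodingException e) {
            // Cannot happen for a Charset instance that already exists.
            throw new IllegalStateException(e);
        }
    }

    /** Re-reads a value that was wrongly decoded as ISO-8859-1 as UTF-8. */
    public static String repair(String garbled) {
        return new String(garbled.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        String name = "\u4e2d\u56fd"; // the two characters of "中国"
        String onTheWire = URLEncoder.encode(name, "UTF-8");
        System.out.println(onTheWire); // %E4%B8%AD%E5%9B%BD

        // Decoding with the wrong charset yields six Latin-1 characters.
        String wrong = decode(onTheWire, StandardCharsets.ISO_8859_1);
        System.out.println(wrong.length()); // 6

        // Decoding with the charset the client used restores the text.
        System.out.println(decode(onTheWire, StandardCharsets.UTF_8).equals(name)); // true

        // ISO-8859-1 maps bytes one to one, so the damage is reversible.
        System.out.println(repair(wrong).equals(name)); // true
    }
}
```

The same bytes arrive either way; only the charset chosen on the server side decides whether the result is readable.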

useBodyEncodingForURI=true means that the query string is decoded with the request body's encoding: if the body is parsed as UTF-8, the query string is parsed as UTF-8 too. Both attributes are set in Tomcat's conf/server.xml, as follows:

<Service name="Catalina">

    <!--The connectors can use a shared executor, you can define one or more named thread pools-->
    <!--
    <Executor name="tomcatThreadPool" namePrefix="catalina-exec-"
        maxThreads="150" minSpareThreads="4"/>
    -->

    <!-- A "Connector" represents an endpoint by which requests are received
         and responses are returned. Documentation at :
         Java HTTP Connector: /docs/config/http.html (blocking & non-blocking)
         Java AJP  Connector: /docs/config/ajp.html
         APR (HTTP/AJP) Connector: /docs/apr.html
         Define a non-SSL HTTP/1.1 Connector on port 8080
    -->
    <Connector port="8080" protocol="HTTP/1.1"
               connectionTimeout="20000"
               redirectPort="8443" useBodyEncodingForURI='true' URIEncoding='UTF-8' />
    <!-- A "Connector" using the shared thread pool-->

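The snippet sets both attributes only to show where they go; it does not mean the two must be configured together.
Next, the role of CharacterEncodingFilter.
To use it, add the following to web.xml: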

    <filter>
        <filter-name>encoding</filter-name>
        <filter-class>org.springframework.web.filter.CharacterEncodingFilter</filter-class>
        <init-param>
            <param-name>encoding</param-name>
            <param-value>UTF-8</param-value>
        </init-param>
        <init-param>
            <param-name>forceEncoding</param-name>
            <param-value>true</param-value>
        </init-param>
    </filter>

    <filter-mapping>
        <filter-name>encoding</filter-name>
        <url-pattern>/*</url-pattern>
    </filter-mapping>

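When forceEncoding is false (the default), the filter sets the request body encoding to its own encoding value only if the request has no Content-Type header or the header carries no charset.
When forceEncoding is true, the filter always sets both the request body encoding and the response encoding to its encoding value.
The source of CharacterEncodingFilter is as follows: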

public class CharacterEncodingFilter extends OncePerRequestFilter {

    private String encoding;

    private boolean forceEncoding = false;


    /**
     * Set the encoding to use for requests. This encoding will be passed into a
     * {@link javax.servlet.http.HttpServletRequest#setCharacterEncoding} call.
     * <p>Whether this encoding will override existing request encodings
     * (and whether it will be applied as default response encoding as well)
     * depends on the {@link #setForceEncoding "forceEncoding"} flag.
     */
    public void setEncoding(String encoding) {
        this.encoding = encoding;
    }

    /**
     * Set whether the configured {@link #setEncoding encoding} of this filter
     * is supposed to override existing request and response encodings.
     * <p>Default is "false", i.e. do not modify the encoding if
     * {@link javax.servlet.http.HttpServletRequest#getCharacterEncoding()}
     * returns a non-null value. Switch this to "true" to enforce the specified
     * encoding in any case, applying it as default response encoding as well.
     * <p>Note that the response encoding will only be set on Servlet 2.4+
     * containers, since Servlet 2.3 did not provide a facility for setting
     * a default response encoding.
     */
    public void setForceEncoding(boolean forceEncoding) {
        this.forceEncoding = forceEncoding;
    }


    @Override
    protected void doFilterInternal(
            HttpServletRequest request, HttpServletResponse response, FilterChain filterChain)
            throws ServletException, IOException {

        if (this.encoding != null && (this.forceEncoding || request.getCharacterEncoding() == null)) {
            request.setCharacterEncoding(this.encoding);
            if (this.forceEncoding) {
                response.setCharacterEncoding(this.encoding);
            }
        }
        filterChain.doFilter(request, response);
    }

}

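The filter has two properties, encoding and forceEncoding, both of which can be set in web.xml.
For every request, doFilterInternal runs. It first calls request.getCharacterEncoding(), which essentially reads the charset from the Content-Type request header. If there is none, it calls request.setCharacterEncoding(this.encoding) to make the filter's encoding the request body encoding. Remember that this encoding applies only to the request body, not to the query string. When forceEncoding=true, the filter's encoding is always set on both request and response, whether or not the Content-Type header carries a charset; on the request side it again applies only to the body.

That covers the conclusions; next comes the source code. Skip it if you like, since it does not affect usage. If you want to understand the mechanism, read on: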

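First, three terms:
org.apache.coyote.Request: the lowest-level request object, holding all the raw request information. Call it coyoteRequest.
org.apache.catalina.connector.Request: implements HttpServletRequest; call it request. It holds a coyoteRequest and a connector, and the connector's role in passing encodings along will become clear shortly.
org.apache.catalina.connector.RequestFacade: also implements HttpServletRequest, but is only a thin wrapper (facade) class; call it requestFacade. Its constructor is: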

    /**
     * Construct a wrapper for the specified request.
     *
     * @param request The request to be wrapped
     */
    public RequestFacade(Request request) {

        this.request = request;

    }

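The constructor receives an org.apache.catalina.connector.Request. requestFacade delegates all of its work to that inner Request, which in turn relies on the lowest-level org.apache.coyote.Request it contains. Through org.apache.catalina.connector.Request we can control how org.apache.coyote.Request works, for example which encoding it uses to parse data.

Relevant fields of org.apache.coyote.Request:
String charEncoding: the request body encoding (passed to the encoding field of Parameters the first time parameters are parsed)
Parameters: the class that parses and stores query-string parameters and body parameters
            (1) String encoding: the encoding for the request body
            (2) String queryStringEncoding: the encoding for the query string
            (3) Map<String,ArrayList<String>> paramHashValues: the parsed parameters
The two encodings in Parameters matter most because they are used directly when decoding data. The encodings held by the other objects mostly just pass a value along until it lands in one of these two fields.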

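Consider the following test filter, which sets the encoding and reads the same parameter twice:

import java.io.IOException;

import javax.servlet.FilterChain;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import org.springframework.web.filter.CharacterEncodingFilter;

public class MyCharacterEncodingFilter extends CharacterEncodingFilter{

    @Override
    protected void doFilterInternal(HttpServletRequest request,
            HttpServletResponse response, FilterChain filterChain)
            throws ServletException, IOException {
        // Set before the first parameter access, so it takes effect for the body.
        request.setCharacterEncoding("UTF-8");
        // First access: triggers parameter parsing.
        String name=request.getParameter("user");
        System.out.println(name);
        // Set again after parsing: too late to change the parsed values.
        request.setCharacterEncoding("UTF-8");
        // Second access: served from the already parsed parameters.
        String name1=request.getParameter("user");
        System.out.println(name1);
        super.doFilterInternal(request, response, filterChain);
    }
}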

The HttpServletRequest passed to the filter is actually an org.apache.catalina.connector.RequestFacade. Let us trace how a parameter is fetched.
requestFacade.getParameter("user") is forwarded to the corresponding method of org.apache.catalina.connector.Request:

    public String getParameter(String name) {

        if (!parametersParsed) {
            parseParameters();
        }

        return coyoteRequest.getParameters().getParameter(name);

    }

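parametersParsed is a field of org.apache.catalina.connector.Request that records whether the parameters have already been parsed. Once they have, they are not parsed again; values are read straight from coyoteRequest's Parameters. Parsing happens only on the first parameter access, so setting the encoding afterwards has no effect: you simply get the result of the first parse back.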
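
The parse-once behavior can be reproduced without Tomcat. The sketch below is a toy model, not Tomcat code: FakeRequest is a hypothetical stand-in that holds one undecoded parameter value, defaults to ISO-8859-1 as the article describes, and decodes lazily on first access.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ParseOnceDemo {

    /** Hypothetical stand-in for a request holding one raw parameter value. */
    public static class FakeRequest {
        private final byte[] rawValue;
        private Charset encoding = StandardCharsets.ISO_8859_1; // default used when nothing is set
        private boolean parametersParsed = false;
        private String parsedValue;

        public FakeRequest(byte[] rawValue) {
            this.rawValue = rawValue;
        }

        public void setCharacterEncoding(Charset encoding) {
            this.encoding = encoding;
        }

        public String getParameter() {
            if (!parametersParsed) {
                // Decode exactly once; later calls return the cached value.
                parametersParsed = true;
                parsedValue = new String(rawValue, encoding);
            }
            return parsedValue;
        }
    }

    private static final String USER = "\u5f20\u4e09"; // the two characters of "张三"

    /** Reads the parameter first, then sets UTF-8: the setting comes too late. */
    public static String readThenSetEncoding() {
        FakeRequest request = new FakeRequest(USER.getBytes(StandardCharsets.UTF_8));
        request.getParameter();
        request.setCharacterEncoding(StandardCharsets.UTF_8);
        return request.getParameter();
    }

    /** Sets UTF-8 before the first read: the value decodes correctly. */
    public static String setEncodingThenRead() {
        FakeRequest request = new FakeRequest(USER.getBytes(StandardCharsets.UTF_8));
        request.setCharacterEncoding(StandardCharsets.UTF_8);
        return request.getParameter();
    }

    public static void main(String[] args) {
        System.out.println(readThenSetEncoding().equals(USER)); // false
        System.out.println(setEncodingThenRead().equals(USER)); // true
    }
}
```

This is why an encoding filter has to run before anything else in the chain touches getParameter.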
Now look at parseParameters(), the method that does the parsing:

    /**
     * Parse request parameters.
     */
    protected void parseParameters() {

        // Mark the parameters as parsed as soon as parsing starts
        parametersParsed = true;

        Parameters parameters = coyoteRequest.getParameters();
        boolean success = false;
        try {
            // Set this every time in case limit has been changed via JMX
            parameters.setLimit(getConnector().getMaxParameterCount());

            // getCharacterEncoding() may have been overridden to search for
            // hidden form field containing request encoding
            // Key point 1
            String enc = getCharacterEncoding();
            // Key point 2
            boolean useBodyEncodingForURI = connector.getUseBodyEncodingForURI();
            if (enc != null) {
                parameters.setEncoding(enc);
                if (useBodyEncodingForURI) {
                    parameters.setQueryStringEncoding(enc);
                }
            } else {
                parameters.setEncoding
                    (org.apache.coyote.Constants.DEFAULT_CHARACTER_ENCODING);
                if (useBodyEncodingForURI) {
                    parameters.setQueryStringEncoding
                        (org.apache.coyote.Constants.DEFAULT_CHARACTER_ENCODING);
                }
            }
            // Key point 3
            parameters.handleQueryParameters();

            if (usingInputStream || usingReader) {
                success = true;
                return;
            }

            if( !getConnector().isParseBodyMethod(getMethod()) ) {
                success = true;
                return;
            }

            String contentType = getContentType();
            if (contentType == null) {
                contentType = "";
            }
            int semicolon = contentType.indexOf(';');
            if (semicolon >= 0) {
                contentType = contentType.substring(0, semicolon).trim();
            } else {
                contentType = contentType.trim();
            }

            if ("multipart/form-data".equals(contentType)) {
                parseParts();
                success = true;
                return;
            }

            if (!("application/x-www-form-urlencoded".equals(contentType))) {
                success = true;
                return;
            }

            int len = getContentLength();

            if (len > 0) {
                int maxPostSize = connector.getMaxPostSize();
                if ((maxPostSize > 0) && (len > maxPostSize)) {
                    if (context.getLogger().isDebugEnabled()) {
                        context.getLogger().debug(
                                sm.getString("coyoteRequest.postTooLarge"));
                    }
                    checkSwallowInput();
                    return;
                }
                byte[] formData = null;
                if (len < CACHED_POST_LEN) {
                    if (postData == null) {
                        postData = new byte[CACHED_POST_LEN];
                    }
                    formData = postData;
                } else {
                    formData = new byte[len];
                }
                try {
                    if (readPostBody(formData, len) != len) {
                        return;
                    }
                } catch (IOException e) {
                    // Client disconnect
                    if (context.getLogger().isDebugEnabled()) {
                        context.getLogger().debug(
                                sm.getString("coyoteRequest.parseParameters"), e);
                    }
                    return;
                }
                // Key point 4
                parameters.processParameters(formData, 0, len);
            } else if ("chunked".equalsIgnoreCase(
                    coyoteRequest.getHeader("transfer-encoding"))) {
                byte[] formData = null;
                try {
                    formData = readChunkedPostBody();
                } catch (IOException e) {
                    // Client disconnect or chunkedPostTooLarge error
                    if (context.getLogger().isDebugEnabled()) {
                        context.getLogger().debug(
                                sm.getString("coyoteRequest.parseParameters"), e);
                    }
                    return;
                }
                if (formData != null) {
                    parameters.processParameters(formData, 0, formData.length);
                }
            }
            success = true;
        } finally {
            if (!success) {
                parameters.setParseFailed(true);
            }
        }

    }

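There are four key points to look at in the code above.

Key point 1: getCharacterEncoding() asks the underlying coyoteRequest for the encoding. coyoteRequest returns its charEncoding if one has already been set; otherwise it reads the charset from the Content-Type header.
The code is as follows: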

In org.apache.catalina.connector.Request:

    /**
     * Return the character encoding for this Request.
     */
    @Override
    public String getCharacterEncoding() {
      return coyoteRequest.getCharacterEncoding();
    }

In org.apache.coyote.Request:

    public String getCharacterEncoding() {

        if (charEncoding != null)
            return charEncoding;

        charEncoding = ContentType.getCharsetFromContentType(getContentType());
        return charEncoding;

    }

If no encoding is found, the method returns null.
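
As a rough illustration of what "reading the charset from Content-Type" involves, here is a simplified standalone sketch. It is not Tomcat's ContentType.getCharsetFromContentType implementation; the class name and the parsing rules (split on semicolons, optional double quotes) are assumptions made for the example.

```java
import java.util.Locale;

public class ContentTypeCharset {

    /** Returns the charset parameter of a Content-Type value, or null if absent. */
    public static String charsetOf(String contentType) {
        if (contentType == null) {
            return null;
        }
        String[] parts = contentType.split(";");
        // parts[0] is the media type; parameters start at index 1.
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            if (param.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String value = param.substring("charset=".length()).trim();
                // Strip optional surrounding double quotes.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value.isEmpty() ? null : value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(charsetOf("application/x-www-form-urlencoded; charset=UTF-8")); // UTF-8
        System.out.println(charsetOf("application/x-www-form-urlencoded"));                // null
    }
}
```

A plain browser form post usually falls into the second case, which is why the fallback encoding discussed next matters so much.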

Key point 2:
boolean useBodyEncodingForURI = connector.getUseBodyEncodingForURI(); this is the value of the useBodyEncodingForURI attribute configured in Tomcat's server.xml.

The constant org.apache.coyote.Constants.DEFAULT_CHARACTER_ENCODING is "ISO-8859-1".

When enc from key point 1 is null, the encoding field of coyoteRequest's Parameters object is set to that "ISO-8859-1", so the request body is parsed as ISO-8859-1. With useBodyEncodingForURI=true, the query string and the body get the same value, "ISO-8859-1". With useBodyEncodingForURI=false, queryStringEncoding is left unchanged. It is null by default, and a null queryStringEncoding also falls back to "ISO-8859-1" at parse time. However, the connector held by org.apache.catalina.connector.Request can assign it a value:
adding URIEncoding="UTF-8" to Tomcat's server.xml assigns that value to queryStringEncoding.
Configure it in server.xml like this:

<Service name="Catalina">

    <!--The connectors can use a shared executor, you can define one or more named thread pools-->
    <!--
    <Executor name="tomcatThreadPool" namePrefix="catalina-exec-"
        maxThreads="150" minSpareThreads="4"/>
    -->

    <!-- A "Connector" represents an endpoint by which requests are received
         and responses are returned. Documentation at :
         Java HTTP Connector: /docs/config/http.html (blocking & non-blocking)
         Java AJP  Connector: /docs/config/ajp.html
         APR (HTTP/AJP) Connector: /docs/apr.html
         Define a non-SSL HTTP/1.1 Connector on port 8080
    -->
    <Connector port="8080" protocol="HTTP/1.1"
               connectionTimeout="20000"
               redirectPort="8443" URIEncoding='UTF-8'/>

The connector assigns this value to queryStringEncoding as follows:

    public void log(org.apache.coyote.Request req,
            org.apache.coyote.Response res, long time) {

        Request request = (Request) req.getNote(ADAPTER_NOTES);
        Response response = (Response) res.getNote(ADAPTER_NOTES);

        if (request == null) {
            // Create objects
            request = connector.createRequest();
            request.setCoyoteRequest(req);
            response = connector.createResponse();
            response.setCoyoteResponse(res);

            // Link objects
            request.setResponse(response);
            response.setRequest(request);

            // Set as notes
            req.setNote(ADAPTER_NOTES, request);
            res.setNote(ADAPTER_NOTES, response);

            // Set query string encoding
            // Key point: the configured URIEncoding is applied here
            req.getParameters().setQueryStringEncoding
                (connector.getURIEncoding());
        }

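connector.getURIEncoding() is the URIEncoding value we configured. The statement
req.getParameters().setQueryStringEncoding(connector.getURIEncoding());
stores the URIEncoding value from Tomcat's server.xml into the all-important queryStringEncoding of Parameters.

When enc from key point 1 is not null, the body encoding of Parameters (encoding) is set to enc, and with useBodyEncodingForURI=true so is queryStringEncoding.
At this point both encoding and queryStringEncoding of Parameters have their values, and the byte arrays are decoded accordingly.

Key points 3 and 4: with the encodings in place, these two calls parse the query string and the request body, respectively.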

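To summarize what each setting does:

request.setCharacterEncoding(encoding): the request exposed to us is a requestFacade. The call goes from requestFacade to request to coyoteRequest and ends up in coyoteRequest's charEncoding. So charEncoding has two possible sources: the charset in Content-Type, or a call to request.setCharacterEncoding(encoding). Call this method before the parameters are parsed for the first time; afterwards it has no effect.

URIEncoding: sets the queryStringEncoding of Parameters directly, i.e. the encoding for the query string.

useBodyEncodingForURI: sets queryStringEncoding to the value of encoding, i.e. the query string is decoded with the request body's encoding.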
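
The rules in this summary can be condensed into two small functions. This is a sketch of the decision logic as described in this article for Tomcat 7.0.55, not Tomcat code; the class and method names are hypothetical. Here charEncoding stands for whatever coyoteRequest ends up with (from setCharacterEncoding or Content-Type, possibly null), and uriEncoding stands for the URIEncoding attribute (null when not configured).

```java
public class EncodingResolution {

    /** Fallback used when nothing else is configured. */
    static final String DEFAULT = "ISO-8859-1";

    /** Encoding applied to the request body. */
    public static String bodyEncoding(String charEncoding) {
        return charEncoding != null ? charEncoding : DEFAULT;
    }

    /** Encoding applied to the query string. */
    public static String queryEncoding(String charEncoding, String uriEncoding,
                                       boolean useBodyEncodingForURI) {
        if (useBodyEncodingForURI) {
            // The body's encoding overwrites whatever URIEncoding provided.
            return bodyEncoding(charEncoding);
        }
        return uriEncoding != null ? uriEncoding : DEFAULT;
    }

    public static void main(String[] args) {
        // Nothing configured: both parts fall back to ISO-8859-1.
        System.out.println(queryEncoding(null, null, false));    // ISO-8859-1
        // URIEncoding="UTF-8" only.
        System.out.println(queryEncoding(null, "UTF-8", false)); // UTF-8
        // useBodyEncodingForURI=true plus a UTF-8 encoding filter.
        System.out.println(queryEncoding("UTF-8", null, true));  // UTF-8
        // useBodyEncodingForURI=true but no body encoding set.
        System.out.println(queryEncoding(null, "UTF-8", true));  // ISO-8859-1
    }
}
```

The last case is the easy one to miss: with useBodyEncodingForURI=true and no body encoding set, a configured URIEncoding does not help the query string.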

Time: 2024-11-03 20:16:11
