Hibernate Performance Tuning

@Sander_Mak


branchandbound.net
Hibernate sucks!
      ... because it’s slow
     ‘The problem is sort of cultural [..] developers
     use Hibernate because they are uncomfortable
     with SQL and with RDBMSes. You should be
     very comfortable with SQL and JDBC before you
     start using Hibernate - Hibernate builds on
     JDBC, it doesn’t replace it. That is the cost of
     extra abstraction [..] save yourself effort,
     pay attention to the database at all stages
     of development.’
                               - Gavin King (creator)
‘Most of the performance problems we have come
up against have been solved not by code
optimizations, but by adding new functionality.’

                       - Gavin King (creator)
‘You can't communicate complexity,
only an awareness of it.’

     - Alan J. Perlis (1st Turing Award winner)
Outline
 Optimization
 Lazy loading
 Examples ‘from the trenches’
   Search queries
   Large collections
   Batching
 Odds & Ends
Optimization
Optimization is hard
 Performance blame
   Framework vs. You
 When to optimize?

 Preserve correctness at all times
   Unit tests ok, but not enough
   Automated integration tests

  Premature optimization is the root of all evil
                                     - Donald Knuth
Optimization guidelines
Measurement
 Ensure stable, production-like environment
 Measure time and space
   Time: isolate timings in different layers
   Space: more heap -> longer GC -> slower
 Try to measure in RDBMS as well
   IO statistics (hot cache or disk thrashing?)
   Query plans
 Make many measurements -> automation
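The "many measurements -> automation" point can be sketched as a tiny timing harness. This is a minimal sketch in plain Java: the class `RepeatedTiming` and its methods are invented for illustration, and a real setup would more likely rely on a profiler or a benchmarking tool.

```java
import java.util.Arrays;

/** Hypothetical helper: times a piece of work repeatedly and reports the median. */
final class RepeatedTiming {

    /** Median of the samples; for an even count, the lower of the two middle values. */
    static long median(long[] samples) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        return sorted[(sorted.length - 1) / 2];
    }

    /** Runs the work warmup + runs times; returns the median of the timed runs in nanoseconds. */
    static long medianNanos(Runnable work, int warmup, int runs) {
        for (int i = 0; i < warmup; i++) {
            work.run(); // let the JIT and caches settle before measuring
        }
        long[] samples = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            work.run();
            samples[i] = System.nanoTime() - start;
        }
        return median(samples);
    }
}
```

The median is used rather than the mean so that a single GC pause does not dominate the result.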
Optimization guidelines
Practical
 Profiler on DAO/Session.query() methods
 VisualVM etc. for heap usage
   Many commercial tools also have built-in JDBC profiling
 Hibernate JMX
   <property name="hibernate.generate_statistics">true</property>
 RDBMS monitoring tools
Analyzing Hibernate
Log SQL:   <property name="show_sql">true</property>
           <property name="format_sql">true</property>


Log4J configuration:
  org.hibernate.SQL -> DEBUG
  org.hibernate.type -> TRACE (see bound params)

Or use P6Spy/Log4JDBC on JDBC connection
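Written out for a Log4J 1.x `log4j.properties` file, the two logger levels look roughly like this. A minimal sketch: it assumes a root logger and an appender are already configured elsewhere in the file.

```properties
# Log every SQL statement Hibernate executes
log4j.logger.org.hibernate.SQL=DEBUG
# Log the values bound to the ? placeholders
log4j.logger.org.hibernate.type=TRACE
```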
Analyzing Hibernate
    2011-07-28 09:57:12,061 DEBUG org.hibernate.SQL - insert into BASKET_LINE_ALLOC (LAST_UPDATED, QUANTITY,
    CUSTOMER_REF, NOTES, BRANCH_ID, FUND_ID, TEMPLATE_ID,
    BASKET_LINE_ALLOC_ID) values (?, ?, ?, ?, ?, ?, ?, ?)
    2011-07-28 09:57:12,081 DEBUG org.hibernate.type.TimestampType - binding '2006-07-28 09:57:12' to parameter: 1
    2011-07-28 09:57:12,081 DEBUG org.hibernate.type.IntegerType - binding '3' to parameter: 2
    2011-07-28 09:57:12,082 DEBUG org.hibernate.type.StringType - binding '' to parameter: 3
    2011-07-28 09:57:12,082 DEBUG org.hibernate.type.StringType - binding '' to parameter: 4
    2011-07-28 09:57:12,082 DEBUG org.hibernate.type.LongType - binding '511' to parameter: 5
    2011-07-28 09:57:12,082 DEBUG org.hibernate.type.LongType - binding '512' to parameter: 6
    2011-07-28 09:57:12,082 DEBUG org.hibernate.type.LongType - binding null to parameter: 7
    2011-07-28 09:57:12,082 DEBUG org.hibernate.type.LongType - binding '180030' to parameter: 8
    Hibernate: INSERT INTO mkyong.stock_transaction (CHANGE, CLOSE, DATE, OPEN, STOCK_ID, VOLUME)
    VALUES (?, ?, ?, ?, ?, ?)
    2011-07-28 13:33:07,253 DEBUG FloatType:133 - binding '10.0' to parameter: 1
    2011-07-28 13:33:07,253 DEBUG FloatType:133 - binding '1.1' to parameter: 2
    2011-07-28 13:33:07,253 DEBUG DateType:133 - binding '30 December 2009' to parameter: 3
    2011-07-28 13:33:07,269 DEBUG FloatType:133 - binding '1.2' to parameter: 4
    2011-07-28 13:33:07,269 DEBUG IntegerType:133 - binding '11' to parameter: 5
    2011-07-28 13:33:07,269 DEBUG LongType:133 - binding '1000000' to parameter: 6
Lazy loading
Lazy loading
One entity to rule them all
Mostly sane defaults:
  @OneToMany, @ManyToMany: LAZY
  @OneToOne, @ManyToOne: EAGER (due to JPA spec.)
Extra-lazy: Hibernate specific
[Diagram: entity model with Request, User, Authorization, Global Company and Company, connected by 1..1, 1..* and *..1 associations]
LazyInitializationException
Lazy loading          N+1 Selects problem
 Select list of N users
 Authorizations necessary:
   N select queries on Authorization executed!
 Without fetch join:
   HQL: SELECT u FROM User u
   SQL 1 query:
     SELECT * FROM User u
       LEFT JOIN Company c ON u.worksForCompany = c.id
   SQL N queries:
     SELECT * FROM Authorization WHERE userId = N
 Solution:
   FETCH JOINS
   @Fetch(FetchMode.JOIN)
   @FetchProfile (enable per session)
   don’t call .size()
 With fetch join:
   HQL: SELECT u FROM User u LEFT OUTER JOIN FETCH u.authorizations
   SQL 1 query:
     SELECT * FROM User u
       LEFT JOIN Company c ON u.worksForCompany = c.id
       LEFT OUTER JOIN Authorization ON ..
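The difference in query counts can be made concrete without a database. The following is a plain-Java simulation, not Hibernate code: `FakeDb`, its methods and the sample data are invented for illustration, and each call to one of its methods stands in for one SQL round trip.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Simulation only: every FakeDb method call counts as one SQL round trip. */
final class NPlusOneDemo {

    /** Hypothetical stand-in for the database; counts the queries it receives. */
    static final class FakeDb {
        private final Map<String, List<String>> authorizationsByUser = new LinkedHashMap<>();
        int queryCount = 0;

        FakeDb() {
            authorizationsByUser.put("alice", Arrays.asList("read", "write"));
            authorizationsByUser.put("bob", Arrays.asList("read"));
            authorizationsByUser.put("carol", new ArrayList<String>());
        }

        /** Like "SELECT u FROM User u": one query, collections left uninitialized. */
        List<String> selectUsers() {
            queryCount++;
            return new ArrayList<>(authorizationsByUser.keySet());
        }

        /** Like lazily initializing one user's collection: one query per user. */
        List<String> selectAuthorizations(String user) {
            queryCount++;
            return authorizationsByUser.get(user);
        }

        /** Like a fetch join: users and their collections in one query. */
        Map<String, List<String>> selectUsersJoinFetchAuthorizations() {
            queryCount++;
            return new LinkedHashMap<>(authorizationsByUser);
        }
    }

    /** Lazy access in a loop: 1 query for the users + N for the collections. */
    static int queriesWithLazyLoading() {
        FakeDb db = new FakeDb();
        for (String user : db.selectUsers()) {
            db.selectAuthorizations(user);
        }
        return db.queryCount;
    }

    /** Fetch join: a single query regardless of N. */
    static int queriesWithFetchJoin() {
        FakeDb db = new FakeDb();
        db.selectUsersJoinFetchAuthorizations();
        return db.queryCount;
    }
}
```

With three users the lazy variant issues four queries and the fetch-join variant one; the gap grows linearly with N.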
Lazy loading
Some guidelines
 Laziness by default = mostly good
 However, architectural impact:
   Session lifetime (‘OpenSessionInView’ pattern)
   Extended Persistence Context
   Proxy usage (runtime code generation)
 Eagerness can be forced with HQL/JPQL/Criteria
 But eagerness cannot be reverted
   exception: Session.load()/EntityManager.getReference()
Search queries
Search queries
[Diagram: Search, Result list and Detail screens on top of the entity model with User, Authorization, Global Company and Company]
Search queries
Obvious solution:

Too much information!
Use summary objects:



UserSummary = POJO
  not attached, only necessary fields (no relations)
  Or: drop down to JDBC to fetch id + fields
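A summary object can be as small as the sketch below. The class, its fields and the query are assumptions for illustration, since the slide's own code is not preserved. It relies on a JPQL constructor expression, which instantiates a plain class per result row, and assumes `User` has `id` and `name` properties; `com.example` is a placeholder package.

```java
/**
 * Hypothetical summary object: plain, detached, no associations.
 * In a real project this would be a public class in its own file.
 */
final class UserSummary {

    /**
     * JPQL constructor expression; the class name must be fully qualified.
     * Typically passed to EntityManager.createQuery(SUMMARY_QUERY, UserSummary.class).
     */
    static final String SUMMARY_QUERY =
            "SELECT new com.example.UserSummary(u.id, u.name) FROM User u ORDER BY u.name";

    private final Long id;
    private final String name;

    /** Called once per result row with the selected values. */
    public UserSummary(Long id, String name) {
        this.id = id;
        this.name = name;
    }

    public Long getId() {
        return id;
    }

    public String getName() {
        return name;
    }
}
```

Because only two columns are selected and no entity is attached to the session, there is nothing to lazy-load and nothing to dirty-check.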
Search queries
Alternative:

Taking it further:




Pagination in queries, not in app. code
Extra count query may be necessary (totals)
         Ordering necessary for paging!
Search queries
 Subtle:
   “The effect of applying setMaxResults or setFirstResult to a query
   involving fetch joins over collections is undefined”
 Cause: the emitted join possibly returns several rows per entity.
   So LIMIT, rownums, TOP cannot be used!
 Instead Hibernate must fetch all rows:
   WARNING: firstResult/maxResults specified with collection fetch; applying in memory!
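Why a row limit and a collection fetch join do not mix can be shown with plain data. The sketch below is a simulation with invented names and sample data, not Hibernate code: it builds the joined rows a database would return and applies a LIMIT-style cut to them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Simulation only: shows what a row limit does to a joined result set. */
final class FetchJoinLimitDemo {

    /** Joined result: the user is repeated once for every authorization. */
    static List<String[]> joinedRows(Map<String, List<String>> authorizationsByUser) {
        List<String[]> rows = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : authorizationsByUser.entrySet()) {
            if (e.getValue().isEmpty()) {
                rows.add(new String[] {e.getKey(), null}); // outer join keeps the user
            }
            for (String authorization : e.getValue()) {
                rows.add(new String[] {e.getKey(), authorization});
            }
        }
        return rows;
    }

    /** Distinct users that survive cutting the joined result to maxRows rows. */
    static Set<String> usersAfterRowLimit(List<String[]> rows, int maxRows) {
        Set<String> users = new LinkedHashSet<>();
        for (String[] row : rows.subList(0, Math.min(maxRows, rows.size()))) {
            users.add(row[0]);
        }
        return users;
    }

    /** Invented sample data: three users with 3, 1 and 2 authorizations. */
    static Map<String, List<String>> sampleData() {
        Map<String, List<String>> data = new LinkedHashMap<>();
        data.put("alice", Arrays.asList("read", "write", "admin"));
        data.put("bob", Arrays.asList("read"));
        data.put("carol", Arrays.asList("read", "write"));
        return data;
    }
}
```

Limiting the six joined rows to three yields a single user instead of a page of three, which is why the limit cannot simply be pushed into the SQL.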
Analytics & reporting
  ORM less relevant: no entities, but complex
  aggregations
  Simple sum/avg/counts possible over entities

  Specialized db calls for complex reports:
     Partitioning/windowing etc.
  Integrate using native query
  Or create a database view and map entity
Large collections
Large collections
[Diagram: Frontend: Request, CompanyGroup 1..* Company. Backend: CompanyGroup 1..* CompanyInGroup *..1 Company, plus Meta-data]
Company: read-only entity, backed by expensive view
Large collections
 Opening large groups sluggish
 Improved performance: @BatchSize on the collection
   Fetches many uninitialized collections in 1 query
 Also possible on entity
 [Diagram: Request, CompanyGroup 1..* Company]
       Better solution in hindsight: fetch join
Large collections
 Extra lazy collection fetching



 Efficient:
   companies.size() -> count query
   companies.contains() -> select 1 where ...
   companies.get(n) -> select * where index = n
Large collections
 Saving large group slow: >15 sec.
 Problem: Hibernate inserts row by row
   Query creation overhead, network latency
 Solution: <property name="hibernate.jdbc.batch_size">100</property>
 Enables JDBC batched statements
 Caution: global property
 Also: <property name="hibernate.order_inserts">true</property>
       <property name="hibernate.order_updates">true</property>
Large collections
 Process Service -> CreateGroup (Soap) -> Business Service
 CreateGroup: ~10 min. for thousands of companies
 @BatchSize on Company improved demarshalling
 JDBC batch_size property marginal improvement
   INFO: INSERT INTO CompanyInGroup VALUES (?,...,?)
   INFO: SELECT @identity
   INFO: INSERT INTO CompanyInGroup VALUES (?,...,?)
   INFO: SELECT @identity
   .. 1000 times
 Insert/select interleaved: due to gen. id
Large collections
 Solution: generate id in app. (not always feasible)
 Running in ~3 minutes with batched inserts
 Next problem: heap usage spiking
 Use StatelessSession
  ✦ Bypass first-level cache
  ✦ No automatic dirty checking
  ✦ Bypass Hibernate event model and interceptors
  ✦ No cascading of operations
  ✦ Collections on entities are ignored
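The effect of moving id generation into the application can be estimated with simple arithmetic. This is a back-of-the-envelope sketch with invented names, not a measurement: it assumes one round trip per statement when ids are database-generated (the interleaved INSERT / SELECT pattern from the log shown earlier) and one round trip per JDBC batch otherwise.

```java
/** Back-of-the-envelope round-trip counts; a simplification, not a measurement. */
final class InsertRoundTrips {

    /**
     * Database-generated ids: each INSERT is executed on its own and followed by
     * a SELECT for the new id, so nothing can be batched.
     */
    static int withGeneratedIds(int rows) {
        return 2 * rows;
    }

    /** Application-assigned ids: inserts are sent in JDBC batches of batchSize rows. */
    static int withAssignedIds(int rows, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        return (rows + batchSize - 1) / batchSize; // ceiling division
    }
}
```

For 1000 rows and a batch size of 100 this is 2000 round trips against 10, which is in line with the batch_size property only paying off once the ids are assigned in the application.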
Large collections
 Now <1 min., everybody happy!
 Data loss detected!
Large collections
 Data loss detected!
 StatelessSession and JDBC batch_size bug
 HHH-4042: Closed, won’t fix
Odds & Ends
Dirty little secret
 validated(item) performs read-only queries
 Dirty collection after each iteration
 Batching fails
 FlushMode.AUTO
   select currentItem from Catalog where ..
   select spendingLimit from User where ..
   insert into Item values (?, ?, ?)
   select currentItem from Catalog where ..
   select spendingLimit from User where ..
   insert into Item values (?, ?, ?)
 Loops always suspect: relational, set-based thinking
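The interaction between queries inside a loop and automatic flushing can be sketched with a stand-in session. This is a simulation, not Hibernate code: `FakeSession` and the item names are invented, and it deliberately simplifies FlushMode.AUTO to "flush pending inserts before every query", whereas Hibernate decides per query whether a flush is needed. Validating everything before saving is shown as one possible restructuring, not necessarily the fix used in the original project.

```java
import java.util.ArrayList;
import java.util.List;

/** Simulation only: a stand-in session that flushes pending inserts before queries. */
final class AutoFlushDemo {

    static final class FakeSession {
        private int pendingInserts = 0;
        final List<Integer> flushSizes = new ArrayList<>();

        /** Queues an insert; nothing is sent to the database yet. */
        void save(String item) {
            pendingInserts++;
        }

        /** Simplified FlushMode.AUTO: pending inserts are flushed before any query runs. */
        void query(String hql) {
            flush();
        }

        void commit() {
            flush();
        }

        private void flush() {
            if (pendingInserts > 0) {
                flushSizes.add(pendingInserts); // one JDBC batch of this size
                pendingInserts = 0;
            }
        }
    }

    /** Stand-in for validated(item): two read-only queries per item. */
    static boolean validated(FakeSession session, String item) {
        session.query("select currentItem from Catalog where ..");
        session.query("select spendingLimit from User where ..");
        return true;
    }

    /** Validate and save inside one loop: every insert is flushed on its own. */
    static List<Integer> interleaved(List<String> items) {
        FakeSession session = new FakeSession();
        for (String item : items) {
            if (validated(session, item)) {
                session.save(item);
            }
        }
        session.commit();
        return session.flushSizes;
    }

    /** Validate everything first, then save: the inserts go out as one batch. */
    static List<Integer> validateThenSave(List<String> items) {
        FakeSession session = new FakeSession();
        List<String> valid = new ArrayList<>();
        for (String item : items) {
            if (validated(session, item)) {
                valid.add(item);
            }
        }
        for (String item : valid) {
            session.save(item);
        }
        session.commit();
        return session.flushSizes;
    }
}
```

For three items the interleaved loop produces three flushes of one insert each, while the restructured version produces a single flush of three.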
Query hints
Speed up read-only service calls:



Hibernate Query.setHint():




Also: never use 2nd level cache just ‘because we can’
     @org.hibernate.annotations.Immutable
Large updates
Naive approach:



Entities are not always necessary:



Changes are not reflected in persistence context
With optimistic concurrency: VERSIONED keyword
       Consider use of stored procedures
Cherish your database
Data and schema outlive your application
Good indexes make a world of difference
Stored procedures etc. are not inherently evil
Do not let Hibernate dictate your schema
  Befriend a DBA instead!

There are other solutions (there I said it)
  MyBatis
  Squeryl (Scala)
Thanks for listening!


@Sander_Mak
                         Join me later today:
                     Elevate your webapps
                        with Scala & Lift!

                         17:00 Room C
branchandbound.net

Hibernate Performance Tuning (JEEConf 2012)

  • 1.
  • 2.
    Hibernate sucks! ... because it’s slow
  • 3.
    Hibernate sucks! ... because it’s slow ‘The problem is sort of cultural [..] developers use Hibernate because they are uncomfortable with SQL and with RDBMSes. You should be very comfortable with SQL and JDBC before you start using Hibernate - Hibernate builds on JDBC, it doesn’t replace it. That is the cost of extra abstraction [..] save yourself effort, pay attention to the database at all stages of development.’ - Gavin King (creator)
  • 5.
    ‘Most of theperformance problems we have come up against have been solved not by code optimizations, but by adding new functionality.’ - Gavin King (creator)
  • 6.
    ‘You can't communicatecomplexity, only an awareness of it.’ - Alan J. Perlis (1st Turing Award winner)
  • 7.
    Outline Optimization Lazyloading Examples ‘from the trenches’ Search queries Large collections Batching Odds & Ends
  • 8.
  • 9.
    Optimization is hard Performance blame Framework vs. You When to optimize? Preserve correctness at all times Unit tests ok, but not enough Automated integration tests
  • 10.
    Optimization is hard Performance blame Framework vs. You When to optimize? Preserve correctness at all times Unit tests ok, but not enough Automated integration tests Premature optimization is the root of all evil - Donald Knuth
  • 11.
    Optimization guidelines Measurement Ensurestable, production-like environment Measure time and space Time: isolate timings in different layers Space: more heap -> longer GC -> slower Try to measure in RDBMS as well IO statistics (hot cache or disk thrashing?) Query plans Make many measurements -> automation
  • 12.
    Optimization guidelines Practical Profileron DAO/Session.query() methods VisualVM etc. for heap usage many commercial tools also have built-in JDBC profiling Hibernate JMX <property name="hibernate.generate_statistics">true </property> RDBMS monitoring tools
  • 13.
    Analyzing Hibernate Log SQL: <property name="show_sql">true</property> <property name="format_sql">true</property> Log4J configuration: org.hibernate.SQL -> DEBUG org.hibernate.type -> TRACE (see bound params) Or use P6Spy/Log4JDBC on JDBC connection
  • 14.
    Analyzing Hibernate 2011-07-28 09:57:12,061 DEBUG org.hibernate.SQL - insert into BASKET_LINE_ALLOC (LAST_UPDATED, QUANTITY, CUSTOMER_REF, NOTES, BRANCH_ID, FUND_ID, TEMPLATE_ID, BASKET_LINE_ALLOC_ID) values (?, ?, ?, ?, ?, ?, ?, ?) Log SQL: 2011-07-28 09:57:12,081 DEBUG org.hibernate.type.TimestampType - binding '2006-07-28 09:57:12' to parameter: 1 2011-07-28 09:57:12,081 DEBUG org.hibernate.type.IntegerType - binding '3' to parameter: 2 <property name="show_sql">true</property> 2011-07-28 09:57:12,082 DEBUG org.hibernate.type.StringType - binding '' to parameter: 3 2011-07-28 09:57:12,082 DEBUG org.hibernate.type.StringType - binding '' to parameter: 4 <property name="format_sql">true</property> 2011-07-28 09:57:12,082 DEBUG org.hibernate.type.LongType - binding '511' to parameter: 5 2011-07-28 09:57:12,082 DEBUG org.hibernate.type.LongType - binding '512' to parameter: 6 2011-07-28 09:57:12,082 DEBUG org.hibernate.type.LongType - binding null to parameter: 7 2011-07-28 09:57:12,082 DEBUG org.hibernate.type.LongType - binding '180030' to parameter: 8 Hibernate: INSERT INTO mkyong.stock_transaction (CHANGE, CLOSE, DATE, OPEN, STOCK_ID, VOLUME) Log4J configuration: VALUES (?, ?, ?, ?, ?, ?) 2011-07-28 13:33:07,253 DEBUG FloatType:133 - binding '10.0' to parameter: 1 2011-07-28 13:33:07,253 DEBUG FloatType:133 - binding '1.1' to parameter: 2 2011-07-28 13:33:07,253 DEBUG DateType:133 - binding '30 December 2009' to parameter: 3 org.hibernate.SQL -> DEBUG 2011-07-28 13:33:07,269 DEBUG FloatType:133 - binding '1.2' to parameter: 4 2011-07-28 13:33:07,269 DEBUG IntegerType:133 - binding '11' to parameter: 5 2011-07-28 13:33:07,269 DEBUG LongType:133 - binding '1000000' to parameter: 6 2011-07-28 09:57:12,061 DEBUG org.hibernate.SQL - insert into BASKET_LINE_ALLOC (LAST_UPDATED, QUANTITY, CUSTOMER_REF, NOTES, BRANCH_ID, FUND_ID, TEMPLATE_ID, org.hibernate.type -> TRACE (see bound params) BASKET_LINE_ALLOC_ID) values (?, ?, ?, ?, ?, ?, ?, ?) 
2011-07-28 09:57:12,081 DEBUG org.hibernate.type.TimestampType - binding '2006-07-28 09:57:12' to 2011-07-28 09:57:12,081 DEBUG org.hibernate.type.IntegerType - binding '3' to parameter: 2 parameter: 1 2011-07-28 09:57:12,082 DEBUG org.hibernate.type.StringType - binding '' to parameter: 3 2011-07-28 09:57:12,082 DEBUG org.hibernate.type.StringType - binding '' to parameter: 4 2011-07-28 09:57:12,082 DEBUG org.hibernate.type.LongType - binding '511' to parameter: 5 2011-07-28 09:57:12,082 DEBUG org.hibernate.type.LongType - binding '512' to parameter: 6 2011-07-28 09:57:12,082 DEBUG org.hibernate.type.LongType - binding null to parameter: 7 Or use P6Spy/Log4JDBC on JDBC connection 2011-07-28 09:57:12,082 DEBUG org.hibernate.type.LongType - binding '180030' to parameter: 8 Hibernate: INSERT INTO mkyong.stock_transaction (CHANGE, CLOSE, DATE, OPEN, STOCK_ID, VOLUME) VALUES (?, ?, ?, ?, ?, ?) 2011-07-28 13:33:07,253 DEBUG FloatType:133 - binding '10.0' to parameter: 1 2011-07-28 13:33:07,253 DEBUG FloatType:133 - binding '1.1' to parameter: 2 2011-07-28 13:33:07,253 DEBUG DateType:133 - binding '30 December 2009' to parameter: 3 2011-07-28 13:33:07,269 DEBUG FloatType:133 - binding '1.2' to parameter: 4 2011-07-28 13:33:07,269 DEBUG IntegerType:133 - binding '11' to parameter: 5 2011-07-28 13:33:07,269 DEBUG LongType:133 - binding '1000000' to parameter: 6
  • 15.
  • 16.
    Lazy loading One entityto rule them all Request Mostly sane defaults: 1..1 @OneToOne, @OneToMany, User @ManyToMany: LAZY 1..* *..1 @ManyToOne : EAGER (Due to JPA spec.) Authorization Extra-lazy: Hibernate *..1 specific Global Company 1..* Company
  • 18.
  • 19.
    Lazy loading N+1 Selects problem Select list of N users HQL: SELECT u FROM User Authorizations necessary: SQL 1 query: SELECT * FROM User N select queries on LEFT JOIN Company c Authorization executed! WHERE u.worksForCompany = c.id Solution: FETCH JOINS @Fetch(FetchMode.JOI N) @FetchProfile (enable per session don’t call .size()
  • 20.
    Lazy loading N+1 Selects problem Select list of N users HQL: SELECT u FROM User Authorizations necessary: SQL N queries: SELECT * FROM Authorization N select queries on WHERE userId = N Authorization executed! Solution: FETCH JOINS @Fetch(FetchMode.JOI N) @FetchProfile (enable per session don’t call .size()
  • 21.
    Lazy loading N+1 Selects problem HQL: SELECT u FROM User OUTER Select list of N users JOIN FETCH u.authorizations Authorizations necessary: SQL 1 query: N select queries on SELECT * FROM User Authorization executed! LEFT JOIN Company c LEFT OUTER JOIN Authorization Solution: ON .. WHERE u.worksForCompany = c.id FETCH JOINS @Fetch(FetchMode.JOI N) @FetchProfile (enable per session don’t call .size()
  • 22.
    Lazy loading Some guidelines Laziness by default = mostly good However, architectural impact: Session lifetime (‘OpenSessionInView’ pattern) Extended Persistence Context Proxy usage (runtime code generation) Eagerness can be forced with HQL/JPAQL/Criteria But eagerness cannot be reverted exception: Session.load()/EntityManager.getReference()
  • 23.
  • 24.
    Search queries Search User Result list 1..* *..1 Authorization *..1 Global 1..* Company Company Detail
  • 25.
    Search queries Obvious solution: Toomuch information! Use summary objects: UserSummary = POJO not attached, only necessary fields (no relations)
  • 26.
    Search queries Obvious solution: Toomuch information! Use summary objects: UserSummary = POJO not attached, only necessary fields (no relations) Or: drop down to JDBC to fetch id + fields
  • 27.
    Search queries Alternative: Taking itfurther: Pagination in queries, not in app. code Extra count query may be necessary (totals) Ordering necessary for paging!
  • 28.
    Search queries Subtle: Alternative: effect of applying setMaxResults “The or setFirstResult to a query involving fetch joins over collections is undefined” Taking it further: Cause: the emitted join possibly returns several rows per entity. So LIMIT, rownums, TOP cannot be used! Instead Hibernate must fetch all rows WARNING: firstResult/maxResults specified with Pagination in queries, not in app. code collection fetch; applying in memory! Extra count query may be necessary (totals) Ordering necessary for paging!
  • 29.
    Analytics & reporting ORM less relevant: no entities, but complex aggregations Simple sum/avg/counts possible over entities Specialized db calls for complex reports: Partitioning/windowing etc. Integrate using native query Or create a database view and map entity
  • 30.
  • 31.
    Large collections Frontend: Request CompanyGroup 1..* Backend: Company CompanyGroup 1..* Meta-data Company read-only entity, CompanyInGroup backed by expensive view *..1 Company
  • 32.
    Large collections Frontend: Request CompanyGroup 1..* Backend: Company CompanyGroup 1..* Meta-data CompanyInGroup *..1 Company
  • 33.
    Large collections Frontend: Request CompanyGroup 1..* Backend: Company CompanyGroup 1..* Meta-data CompanyInGroup *..1 Company
  • 34.
    Large collections Openinglarge groups sluggish Improved performance: Fetches many uninitialized collections in 1 query Also possible on entity: Request CompanyGroup 1..* Company
  • 35.
    Large collections Openinglarge groups sluggish Improved performance: Fetches many uninitialized collections in 1 query Also possible on entity: Request CompanyGroup 1..* Company Better solution in hindsight: fetch join
  • 36.
    Large collections Extralazy collection fetching Efficient: companies.size() -> count query companies.contains() -> select 1 where ... companies.get(n) -> select * where index = n
  • 37.
    Large collections Savinglarge group slow: >15 sec. Problem: Hibernate inserts row by row Query creation overhead, network latency Solution: <property name="hibernate.jdbc.batch_size">100 </property> Enables JDBC batched statements Caution: global property Request CompanyGroup Also: <property name="hibernate.order_inserts">true 1..* </property> Company <property name="hibernate.order_updates">true </property>
  • 38.
    Large collections Frontend: Request CompanyGroup 1..* Backend: Company CompanyGroup 1..* Meta-data CompanyInGroup *..1 Company
  • 39.
    Large collections Backend: CompanyGroup 1..* Meta-data CompanyInGroup *..1 Company
  • 40.
    Large collections Process CreateGroup (Soap) Business Service Service CreateGroup: ~10 min. for thousands of companies @BatchSize on Company improved demarshalling JDBC batch_size property marginal improvement INFO: INSERT INTO CompanyInGroup VALUES (?,...,?) INFO: SELECT @identity INFO: INSERT INTO CompanyInGroup VALUES (?,...,?) INFO: SELECT @identity .. 1000 times CompanyGroup 1..* Meta-data Insert/select interleaved: due to gen. id CompanyInGroup *..1 Company
  • 41.
    Large collections Process CreateGroup (Soap) Business Service Service Solution: generate id in app. (not always feasible) Running in ~3 minutes with batched inserts Next problem: heap usage spiking Use StatelessSession ✦ Bypass first-level cache ✦ No automatic dirty checking CompanyGroup ✦ Bypass Hibernate event model and interceptors 1..* Meta-data ✦ No cascading of operations CompanyInGroup ✦ Collections on entities are ignored *..1 Company
  • 42.
    Large collections Process CreateGroup (Soap) Business Service Service Solution: generate id in app. (not always feasible) Running in ~3 minutes with batched inserts Next problem: heap usage spiking Use StatelessSession CompanyGroup 1..* Meta-data CompanyInGroup *..1 Company
  • 43.
    Large collections Process CreateGroup (Soap) Business Service Service Now <1 min., everybody happy! CompanyGroup 1..* Meta-data CompanyInGroup *..1 Company
  • 44.
    Large collections Process CreateGroup (Soap) Business Service Service Now <1 min., everybody happy! Data loss detected! CompanyGroup 1..* Meta-data CompanyInGroup *..1 Company
  • 45.
    Large collections Process CreateGroup (Soap) Business Service Service Data loss detected! StatelessSession and JDBC batch_size bug HHH-4042: Closed, won’t fix : CompanyGroup 1..* Meta-data CompanyInGroup *..1 Company
  • 46.
  • 47.
    Dirty little secret validated(item) performs read-only queries select currentItem from Catalog where .. Dirty collection after select spendingLimit from User where .. each iteration insert into Item values (?, ?, ?) select currentItem from Catalog where .. select spendingLimit from User where .. Batching fails insert into Item values (?, ?, ?) Flushmode.AUTO Loops always suspect: relational, set-based thinking
  • 50.
    Query hints
      Speed up read-only service calls:
      Hibernate Query.setHint():
      Also: never use 2nd level cache just ‘because we can’
      @org.hibernate.annotations.Immutable
  • 52.
    Large updates
      Naive approach:
      Entities are not always necessary:
      Changes are not reflected in persistence context
      With optimistic concurrency: VERSIONED keyword
      Consider use of stored procedures
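The code for this slide did not survive in the transcript. As a hedged illustration of the idea, a bulk HQL statement can replace a loop that loads and modifies every entity. The `Item` entity and its `active` and `expiryDate` properties are hypothetical, and the class below only holds the query strings:

```java
/** HQL strings only; entity and property names are hypothetical. */
final class BulkUpdateHql {

    private BulkUpdateHql() {}

    /** One UPDATE statement instead of loading and modifying every entity. */
    static final String BULK_UPDATE =
        "update Item set active = false where expiryDate < :cutoff";

    /** Same update, also updating the version (or timestamp) property. */
    static final String BULK_UPDATE_VERSIONED =
        "update versioned Item set active = false where expiryDate < :cutoff";

    // With a Hibernate Session in scope, execution would look roughly like:
    //   int updated = session.createQuery(BULK_UPDATE_VERSIONED)
    //                        .setParameter("cutoff", cutoff)
    //                        .executeUpdate();
}
```

Entities already loaded in the persistence context keep their old state after such a statement, so it is safest to run it before loading the affected entities or to clear the context afterwards.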
  • 53.
    Cherish your database
      Data and schema outlive your application
      Good indexes make a world of difference
      Stored procedures etc. are not inherently evil
      Do not let Hibernate dictate your schema
      Befriend a DBA instead!
      There are other solutions (there I said it)
        MyBatis
        Squeryl (Scala)
  • 54.
    Thanks for listening!
      @Sander_Mak
      Join me later today: Elevate your webapps with Scala & Lift! 17:00, Room C
      branchandbound.net

Editor's Notes

  • #2 Sander Mak - lead developer Java - Info Support
    Dutch accent
    Hibernate experience, not committer who knows everything
  • #3 Q: who has heard / said / thought this?
    Leaky abstraction -> not going to defend ORM, many advantages
    1) mapping problems -> impedance mismatch
    2) performance problems -> stop treating Hibernate as a black box!
  • #4-#6 Tangle of 37
    Red = bad -> cyclic dependency
    Hibernate implementation is complex, but battle-tested
    JPA tutorials paint a rosy picture: using Hibernate can be quite hard!
  • #7 Examples use the Hibernate and JPA APIs interchangeably: start with JPA, you will need Hibernate specifics
  • #9 Tuning performance is a bit like refactoring: don’t change the semantics, just the how.
    Preserving correctness: unit tests! However, the more you reach the edges of the DBMS, the easier it is to hit an obscure bug in the query optimizer, caching strategy etc.
  • #10 Know your RDBMS! Database independence is nice when porting is necessary, but focus on the particular DB used in production (document it!); that is what counts.
    (Once you get into nitty-gritty optimization details, you will have to know the RDBMS intimately.)
    Indexes, covering indexes, locking strategies, ‘vacuuming’ / ‘transaction logs’ / resetting statistics
  • #11 Hardware vs. virtualized, real data volumes, simulate real workloads
  • #12 SQL Server Management Studio
  • #13 Question I hear most often: how to see parameter values
  • #15 Beware: you might retrieve your whole database in one go...
    Code example: will load Company eagerly, Auths. lazily
    Extra-lazy: discussed later with large collections
  • #16-#18 First encounter with lazy loading :)
    Extended persistence contexts, OpenSessionInView pattern and other band-aids
  • #19-#24 Eager vs. lazy is a contract specifying WHEN relations are retrieved, not HOW. For the HOW you can define fetching strategies.
    Also possible to define a fetch join on Criteria queries
  • #25 Lazy as default, tune eager loading in queries specifically for your use case (DAO pattern not that bad after all)
  • #28, #30 The same considerations apply to reporting queries: solve as much as possible in the query itself and do not return entities when they are not needed
  • #31 Hibernate is specifically not for bulk manipulation: use stored procs for that. But when is something bulk? Collections with thousands of elements occur routinely in OLTP applications.
  • #36 Example: 20 groups with uninitialized collections, access first collection: all are initialized with 1 query.
    Measured: opening first group was slightly slower, general user experience better
  • #44-#45 Stateless session is ideal for fire-and-forget service calls, less so in a user-facing application where keeping the persistence context consistent matters.
  • #49 Only ‘full batches’ were performed
  • #51-#52 Each item is flushed automatically before doing the lookup queries, because the collection becomes dirty after adding an element
  • #53 Hibernate does some optimizing for read-only entities:
    It saves execution time by not dirty-checking simple properties or single-ended associations.
    It saves memory by deleting the database snapshot.
    A cache increases load on memory, possibly causing more GC pauses for the app. if co-located with the application
  • #54-#55 Interesting: all instances of the entity are evicted from the second-level cache with such a query, even if the WHERE clause limits the affected entities.
    Also, no events are fired as Hibernate normally would do.