Marcel Mitran – STSM, Architect Java on System z
mmitran@ca.ibm.com
November 20th, 2012




IBM Java PackedObjects: An Overview
IBM Software Group: Java Technology Centre




                                                   © 2012 IBM Corporation
Important Disclaimers



THE INFORMATION CONTAINED IN THIS PRESENTATION IS PROVIDED FOR INFORMATIONAL PURPOSES
ONLY.

WHILST EFFORTS WERE MADE TO VERIFY THE COMPLETENESS AND ACCURACY OF THE INFORMATION
CONTAINED IN THIS PRESENTATION, IT IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED.

ALL PERFORMANCE DATA INCLUDED IN THIS PRESENTATION HAVE BEEN GATHERED IN A CONTROLLED
ENVIRONMENT. YOUR OWN TEST RESULTS MAY VARY BASED ON HARDWARE, SOFTWARE OR
INFRASTRUCTURE DIFFERENCES.

ALL DATA INCLUDED IN THIS PRESENTATION ARE MEANT TO BE USED ONLY AS A GUIDE.

IN ADDITION, THE INFORMATION CONTAINED IN THIS PRESENTATION IS BASED ON IBM’S CURRENT
PRODUCT PLANS AND STRATEGY, WHICH ARE SUBJECT TO CHANGE BY IBM, WITHOUT NOTICE.

IBM AND ITS AFFILIATED COMPANIES SHALL NOT BE RESPONSIBLE FOR ANY DAMAGES ARISING OUT OF
THE USE OF, OR OTHERWISE RELATED TO, THIS PRESENTATION OR ANY OTHER DOCUMENTATION.

NOTHING CONTAINED IN THIS PRESENTATION IS INTENDED TO, OR SHALL HAVE THE EFFECT OF:

- CREATING ANY WARRANTY OR REPRESENTATION FROM IBM, ITS AFFILIATED COMPANIES OR ITS OR
THEIR SUPPLIERS AND/OR LICENSORS



PackedObject Delivery and Intended Use

PackedObject is an experimental feature in the IBM J9 Virtual Machine.


Goal(s) of Feature:
■   Improve serialization and I/O of Java objects
■   Allow direct access to “native” (off-heap) data
■   Allow for explicit source-level representation of compact data structures



Intended Use:
■   Provide an opportunity for feedback and experimentation
      – Not meant for production support
      – Not a committed language change




PackedObjects for IBM's Java

Features of today's Java work well in certain scenarios, poorly in others...
    ●   Bloated objects: data headers and references are required to access and use data stored outside of Java
    ●   No direct access to off-heap data: the Java Native Interface or Direct Byte Buffers are required to access it
    ●   Redundant data copying: copies of off-heap data are required to incorporate or act on changes to the source data
    ●   Suboptimal heap placement: non-adjacent placement of objects in memory slows down serialization and garbage collection

...changing how Java data is represented and how native data is accessed and used introduces new efficiencies into the Java language
    ●   Shared headers and no references
    ●   Direct access to natively stored off-heap data
    ●   Elimination of data copies
    ●   In-lined data allows for optimal caching

Speak to me in 'Java', I don't speak 'Native'
■   Java only speaks ‘Java’…
      – Data typically must be copied/(de)serialized/marshalled onto/off Java heap
      – Costly in path-length and footprint
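
The copy described above can be sketched in standard Java. This is a minimal illustration, not code from the presentation: the `TradeRecord` class, its field names, and the layout (an int id, a long timestamp, a double amount) are assumptions, chosen only so that the record is 20 bytes like the native record drawn on the following slides.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical 20-byte native record: int id (4) + long timestamp (8) + double amount (8).
final class TradeRecord {
    static final int SIZE = 20;

    final int id;
    final long timestamp;
    final double amount;

    TradeRecord(int id, long timestamp, double amount) {
        this.id = id;
        this.timestamp = timestamp;
        this.amount = amount;
    }

    // Deserialize: every field is copied out of the raw bytes into a new heap object.
    static TradeRecord fromBytes(byte[] raw) {
        ByteBuffer buf = ByteBuffer.wrap(raw).order(ByteOrder.BIG_ENDIAN);
        return new TradeRecord(buf.getInt(), buf.getLong(), buf.getDouble());
    }

    // Serialize: every field is copied back into a fresh byte array.
    byte[] toBytes() {
        return ByteBuffer.allocate(SIZE).order(ByteOrder.BIG_ENDIAN)
                .putInt(id).putLong(timestamp).putDouble(amount).array();
    }
}
```

Each round trip allocates a heap object (with its own header) plus a byte array, which is the path-length and footprint cost the slide refers to.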




On-Heap PackedObjects Example

■   Allows controlled layout of storage of data structures on the Java heap
      – Reduces footprint of data on Java heap
      – No (de)serialization required




    [Figure: I/O, native storage (20 bytes), Java heap, JVM]




Off-Heap PackedObjects Example
■   Enable Java to talk directly to the native data structure
     – Avoid overhead of data copy onto/off Java heap
     – No (de)serialization required
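
Standard Java has no PackedObject type, so the closest approximation today is a hand-written flyweight view over a direct `ByteBuffer`. The sketch below is that approximation, not the IBM feature: the class name `TradeView`, the factory `overNewBlock`, and the 20-byte layout (int id at offset 0, long timestamp at offset 4, double amount at offset 12) are all assumptions made for illustration.

```java
import java.nio.ByteBuffer;

// Flyweight view over off-heap memory: fields are read and written in place,
// so no per-record heap object ever holds a copy of the data.
final class TradeView {
    static final int SIZE = 20;                       // assumed record size in bytes
    private static final int ID_OFFSET = 0;           // int, 4 bytes
    private static final int TIMESTAMP_OFFSET = 4;    // long, 8 bytes
    private static final int AMOUNT_OFFSET = 12;      // double, 8 bytes

    private final ByteBuffer mem;
    private int base;                                 // byte offset of the current record

    private TradeView(ByteBuffer mem) { this.mem = mem; }

    // Stand-in for memory that native code would normally own and fill.
    static TradeView overNewBlock(int records) {
        return new TradeView(ByteBuffer.allocateDirect(records * SIZE));
    }

    // Repositions the view on another record without allocating anything.
    TradeView moveTo(int index) {
        base = index * SIZE;
        return this;
    }

    int id() { return mem.getInt(base + ID_OFFSET); }
    void id(int v) { mem.putInt(base + ID_OFFSET, v); }

    long timestamp() { return mem.getLong(base + TIMESTAMP_OFFSET); }
    void timestamp(long v) { mem.putLong(base + TIMESTAMP_OFFSET, v); }

    double amount() { return mem.getDouble(base + AMOUNT_OFFSET); }
    void amount(double v) { mem.putDouble(base + AMOUNT_OFFSET, v); }
}
```

The offsets and accessors here are maintained by hand. The stated goal of off-heap PackedObjects is for the JVM to derive that plumbing from an ordinary class declaration.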




    [Figure: I/O, native storage (20 bytes), metadata, JVM, Java heap]


Example: Distributed Computing High-Level Architecture


■   Communication between nodes (RDMA, hyper-sockets, ORB, etc.):
      – Data copy
      – (De)serialization
■   Data persistency on each node (DB, file-system, etc.):
      – Data copy
      – (De)serialization
■   Using Java packed objects, data can be moved between the persistency and communication
    layers without being copied or (de)serialized onto/off the Java heap

    [Figure: two nodes, each with a DB and an app. server running in a JVM]



Example: Inter-language Communication

■   Java requires data copies, marshalling and serialization across language boundaries
■   Java packed objects avoid data copies, marshalling and serialization

    [Figure, shown for both cases: a call chain across COBOL, Java and C/C++, in which COBOL foo(…) calls Java goo(…), which calls C/C++ loo(…)]

PackedObjects 101
■    A new PackedObject type for the Java language, which allows for:
      – Direct access of data located outside of the Java heap
      – Contiguous allocation of all object's data (objects and arrays)
      – PackedObject is not derived from Object, and hence disallows assignment and casting between the two
      – A special BoxedPackedObject is the glue used to reference a PackedObject from an Object

    [Figure: two separate class hierarchies]
      – java/lang/Object: java/math/BigDecimal, java/lang/String, java/lang/HashMapEntry, java/lang/BoxedPackedObject, etc…
      – java/lang/PackedObject: java/lang/PackedArray, java/lang/PackedString, java/lang/PackedHashMapEntry, etc…




■    Current Java Capabilities
      – Current Java logic requires language interpreters and data copies for execution.
      – PackedObjects eliminate data copies across the Java Native Interface and the
        need to design and maintain Direct Byte Buffers
■    Using PackedObjects: an annotation (or, later, a packed keyword) above a class definition is
     required to create a packed class. Class instances can be accessed and modified identically
     to current Java objects

Scope of Implementation
■    “@Packed” class annotation used to define a PackedObject class
■    “@Length” field annotation used to specify length of PackedObject arrays

Proposed Initial Rules
■    Packed types must directly subclass PackedObject
■    Packed inlining can only happen for field declarations which are primitives, PackedObjects or arrays of
     PackedObjects
■    Fields made up of arrays must provide a length that is a compile time constant
■    Regular Java primitive types cannot be used to declare a PackedObject array. Boxed types for
     primitive arrays must be used instead.
■    A field declaration cannot introduce a circular class dependency
■    When a PackedObject is instantiated, only the constructor for the top-level PackedObject is called
■    Local variable assignment and parameter passing of a PackedObject is copy-by-reference
■    BoxedPackedObject is used to box a PackedObject with an Object reference
■    Allocating a PackedObject using the 'new' keyword creates an on-heap PackedObject
■    Off-heap PackedObject creation is done using a factory method provided in the class library

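Several of the proposed rules can be illustrated with ordinary Java declarations. The sketch below is an assumption-laden mock-up: `Packed`, `Length`, and `PackedObject` are local stand-ins declared here only so the example compiles on a standard JVM. They mimic the names on this slide but are not the experimental IBM J9 types, whose packages and signatures this deck does not show, and they have no effect on object layout. `Point`, `Segment`, and `PackedRulesDemo` are likewise hypothetical.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Stand-in annotations and root type (assumptions, see lead-in).
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface Packed {}

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface Length { int value(); }

abstract class PackedObject {}

@Packed
final class Point extends PackedObject {   // rule: directly subclasses PackedObject
    int x;                                  // rule: primitive fields are eligible for inlining
    int y;
}

@Packed
final class Segment extends PackedObject {
    Point origin;             // a packed-aware JVM would inline this; here it is a plain reference
    @Length(2) Point[] ends;  // rule: array fields declare a compile-time constant length
}

final class PackedRulesDemo {
    // Reads the declared array length back from the stand-in annotation via reflection.
    static int declaredLength(Class<?> type, String fieldName) {
        try {
            Length length = type.getDeclaredField(fieldName).getAnnotation(Length.class);
            if (length == null) {
                throw new IllegalArgumentException(fieldName + " has no @Length");
            }
            return length.value();
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException(e);
        }
    }
}
```

Neither class introduces a circular dependency: `Segment` contains `Point`, but `Point` does not contain `Segment`. That is the shape the circular-dependency rule requires, since an inlined field cannot contain its own enclosing type.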

Code Snippets
■    Packed class definition      ■   On-Heap Packed Allocation




■    Off-heap Packed Allocation




Functionality Changes

Data field allocation and storage
■   Current Java: Object fields are limited to primitives or references to other objects;
    non-primitives must be initialized and copied into a format understood by Java.
■   PackedObject: When a PackedObject is allocated, all corresponding data fields are allocated
    simultaneously and packed into a single contiguous object (rather than referenced).

Child objects
■   Current Java: Headers for child objects are copied onto the Java heap when accessed.
■   PackedObject: Child objects have no headers of their own; they all share the global header
    on the PackedObject.

Arrays
■   Current Java: For arrays of objects, each element has its own header and a reference to it.
    The elements are not contiguous in memory.
■   PackedObject: Arrays are packed together contiguously under one common header; the array
    length is marked in the PackedObject header. Full access to the elements of the array is
    retained and bounds checking is still performed.

Off-heap
■   Current Java: Data outside of the Java heap cannot be accessed or modified. It must be
    converted into a Java version, and only this copy can then be accessed and manipulated.
■   PackedObject: Data that does not exist in Java can be accessed and modified directly by
    using the data's memory location. The Java Virtual Machine takes care of the accessors and
    modifiers internally.
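
The array comparison above can be turned into rough arithmetic. The sketch below is a back-of-the-envelope estimate under stated assumptions, not a measurement: header and reference sizes vary by JVM and configuration, alignment padding is ignored, and the class and method names are invented for illustration.

```java
final class FootprintEstimate {
    // Conventional array of objects: one array header, one reference per element,
    // and one separately allocated object (header + payload) per element.
    static long conventional(int elements, int payloadBytes, int headerBytes, int referenceBytes) {
        return headerBytes
                + (long) elements * referenceBytes
                + (long) elements * (headerBytes + payloadBytes);
    }

    // Packed array: one shared header followed by the element payloads, back to back.
    static long packed(int elements, int payloadBytes, int headerBytes) {
        return headerBytes + (long) elements * payloadBytes;
    }
}
```

For example, with 1,000 elements of 8 bytes each and an assumed 16-byte header and 8-byte reference, `conventional(1000, 8, 16, 8)` gives 32,016 bytes while `packed(1000, 8, 16)` gives 8,016 bytes. Under these assumptions the saving comes entirely from the per-element headers and references.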


Off-Heap Benefit: Lowers Memory Footprint, Increases Performance

Before
●   Java requires objects to be in primitive form to be accessed directly*
●   If objects are not in primitive form, references and copies are required to access the data,
    which is a time-consuming conversion process
●   When objects are graphed onto the heap, they are placed randomly and occupy more space than
    is needed
●   Memory bloat occurs due to data copies (data must be accessed and copied, including headers)
*without the use of JNI or Direct Byte Buffers (DBB)

    [Figure: header/data/reference records in native memory, each copied together with its header onto the Java heap]

After
●   PackedObjects eliminate the requirement for objects to be in primitive form
●   PackedObjects can be accessed directly from the source without redundant copying or conversion
●   A PackedObject allocates and packs all data fields (including other PackedObjects and arrays)
    into a single well-defined contiguous storage area

    [Figure: a single header on the Java heap with direct access to native memory, no copy]


Copyright and Trademarks



© IBM Corporation 2012. All Rights Reserved.


IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of International
Business Machines Corp., and registered in many jurisdictions worldwide.


Other product and service names might be trademarks of IBM or other companies.


A current list of IBM trademarks is available on the Web – see the IBM “Copyright and
trademark information” page at URL: www.ibm.com/legal/copytrade.shtml




