Provided by: perl-doc_5.40.0-8_all

NAME

       perlclassguts - Internals of how "feature 'class'" and class syntax work

DESCRIPTION

       This document provides in-depth information about the way in which the perl interpreter
       implements the "feature 'class'" syntax and overall behaviour.  It is not intended as an
       end-user guide on how to use the feature. For that, see perlclass.

       The reader is assumed to be generally familiar with the perl interpreter internals
       overall. For a more general overview of these details, see also perlguts.

DATA STORAGE

   Classes
       A class is fundamentally a package, and exists in the symbol table as an HV with an aux
       structure in exactly the same way as a non-class package. It is distinguished from a non-
       class package by the fact that the HvSTASH_IS_CLASS() macro will return true on it.

       Extra information relating to it being a class is stored in the "struct xpvhv_aux"
       structure attached to the stash, in the following fields:

           HV          *xhv_class_superclass;
           CV          *xhv_class_initfields_cv;
           AV          *xhv_class_adjust_blocks;
           PADNAMELIST *xhv_class_fields;
           PADOFFSET    xhv_class_next_fieldix;
           HV          *xhv_class_param_map;

       •   "xhv_class_superclass" will be "NULL" for a class with no superclass. It will point
           directly to the stash of the parent class if one has been set with the :isa() class
           attribute.

       •   "xhv_class_initfields_cv" will contain a "CV *" pointing to a function to be invoked
           as part of the constructor of this class or any subclass thereof. This CV is
           responsible for initializing all the fields defined by this class for a new instance.
           This CV will be an anonymous real function - i.e. while it has no name and no GV, it
           is not a protosub and may be directly invoked.

       •   "xhv_class_adjust_blocks" may point to an AV containing CV pointers to each of the
           "ADJUST" blocks defined on the class. If the class has a superclass, this array will
           additionally contain copies of the pointers to its parent class's "ADJUST" CVs. The AV
           is created lazily the first time an element is pushed to it; it is valid for there not
           to be one, and this pointer will be "NULL" in that case.

           The CVs are stored directly, not via RVs. Each CV will be an anonymous real function.

       •   "xhv_class_fields" will point to a "PADNAMELIST" containing "PADNAME"s, each being one
           defined field of the class. They are stored in order of declaration. Note however,
           that the index into this array will not necessarily be equal to the "fieldix" of each
           field, because in the case of a subclass, the array will begin at zero but the index
           of the first field in it will be non-zero if its parent class contains any fields at
           all.

           For more information on how individual fields are represented, see "Fields".

       •   "xhv_class_next_fieldix" gives the field index that will be assigned to the next field
           to be added to the class. It is only useful at compile-time.

       •   "xhv_class_param_map" may point to an HV which maps field ":param" attribute names to
           the field index of the field with that name. This mapping is copied from parent
           classes; each class's map contains every entry inherited from its parents in addition
           to its own.
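
       The relationship between a position in "xhv_class_fields" and a field's "fieldix" can be
       sketched with a small standalone model. This is not perl's real structure: "toy_class",
       "toy_first_fieldix" and "toy_fieldix_of" are hypothetical names, and the sketch assumes
       that a subclass's fields are numbered immediately after those of its ancestors.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for the class-related aux fields; not perl's real struct. */
struct toy_class {
    const struct toy_class *superclass; /* NULL when there is no parent class */
    size_t n_own_fields;                /* entries in this class's own field list */
};

/* Field index given to the first field declared by c, assuming (hypothetically)
 * that ancestors occupy the lower indices contiguously. */
static size_t toy_first_fieldix(const struct toy_class *c)
{
    size_t n = 0;
    for (const struct toy_class *p = c->superclass; p != NULL; p = p->superclass)
        n += p->n_own_fields;
    return n;
}

/* Translate position i in c's own field list into an instance field index. */
static size_t toy_fieldix_of(const struct toy_class *c, size_t i)
{
    assert(i < c->n_own_fields);
    return toy_first_fieldix(c) + i;
}
```

       Under these assumptions, a subclass declaring three fields on top of a parent with two
       would list them at positions 0 to 2 but store them at field indexes 2 to 4.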

   Fields
       A field is still fundamentally a lexical variable declared in a scope, and exists in the
       "PADNAMELIST" of its corresponding CV. Methods and other method-like CVs can still capture
       them exactly as they can with regular lexicals. A field is distinguished from other kinds
       of pad entry in that the PadnameIsFIELD() macro will return true on it.

       Extra information relating to it being a field is stored in an additional structure
       accessible via the PadnameFIELDINFO() macro on the padname. This structure has the
       following fields:

           PADOFFSET  fieldix;
           HV        *fieldstash;
           OP        *defop;
           SV        *paramname;
           bool       def_if_undef;
           bool       def_if_false;

       •   "fieldix" stores the "field index" of the field; that is, the index into the instance
           field array where this field's value will be stored. Note that the first index in the
           array is not specially reserved. The first field in a class will start from field
           index 0.

       •   "fieldstash" stores a pointer to the stash of the class that defined this field. This
           is necessary in case there are multiple classes defined within the same scope; it is
           used to disambiguate the fields of each.

               {
                   class C1; field $x;
                   class C2; field $x;
               }

       •   "defop" may store a pointer to a defaulting expression optree for this field.
           Defaulting expressions are optional; this field may be "NULL".

       •   "paramname" may point to a regular string SV containing the ":param" name attribute
           given to the field. If none, it will be "NULL".

       •   At most one of "def_if_undef" and "def_if_false" will be true: "def_if_undef" if the
           defaulting expression was set using the "//=" operator, "def_if_false" if it was set
           using "||=".
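
       How the two flags might influence the choice between a caller-supplied parameter and the
       defaulting expression can be sketched as follows. This is a simplified standalone model
       rather than the interpreter's code: "toy_fieldinfo", "toy_param" and "toy_use_default"
       are hypothetical, and the sketch assumes the defaulting semantics described in perlclass
       for "=", "//=" and "||=".

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the defaulting-related members of the field info. */
struct toy_fieldinfo {
    bool has_defop;     /* models defop != NULL */
    bool def_if_undef;  /* default was written with //= */
    bool def_if_false;  /* default was written with ||= */
};

/* What the caller supplied for the field's :param name (hypothetical model). */
struct toy_param {
    bool present;  /* the name appeared in the constructor arguments */
    bool defined;  /* the supplied value was not undef */
    bool truthy;   /* the supplied value was true */
};

/* Returns true when the defaulting expression should provide the value. */
static bool toy_use_default(const struct toy_fieldinfo *f,
                            const struct toy_param *p)
{
    assert(!(f->def_if_undef && f->def_if_false)); /* at most one mode is set */
    if (!f->has_defop)   return false;       /* nothing to fall back on */
    if (!p->present)     return true;        /* parameter missing entirely */
    if (f->def_if_undef) return !p->defined; /* //= also replaces undef */
    if (f->def_if_false) return !p->truthy;  /* ||= also replaces false values */
    return false;                            /* plain = keeps whatever was given */
}
```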

   Methods
       A method is still fundamentally a CV, and has the same basic representation as one. It has
       an optree and a pad, and is stored via a GV in the stash of its containing package. It is
       distinguished from a non-method CV by the fact that the CvIsMETHOD() macro will return
       true on it.

       (Note: This macro should not be confused with the one that was previously called
       CvMETHOD(). That one does not relate to the class system, and was renamed to
       CvNOWARN_AMBIGUOUS() to avoid this confusion.)

       There is currently no extra information that needs to be stored about a method CV, so the
       structure does not add any new fields.

   Instances
       Object instances are represented by an entirely new SV type, whose base type is
       "SVt_PVOBJ". An instance is still blessed into its class stash and wrapped in an RV in
       the usual manner for classical objects.

       As these are their own unique container type, distinct from hashes or arrays, the core
       "builtin::reftype" function returns a new value when asked about these. That value is
       "OBJECT".

       Internally, such an object is an array of SV pointers whose size is fixed at creation time
       (because the number of fields in a class is known after compilation). An object instance
       stores the max field index within it (for basic error-checking on access), and a fixed-
       size array of SV pointers storing the individual field values.

       Fields of array and hash type directly store AV or HV pointers into the array; they are
       not stored via an intervening RV.
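
       The instance layout described above, a maximum field index plus a fixed-size array of
       value pointers, can be modelled in isolation. The following is a minimal sketch with
       hypothetical names ("toy_object" and its helpers); it uses plain "void *" slots in place
       of SV pointers and is not the interpreter's actual body structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy model of an object body: not perl's real structure. */
struct toy_object {
    ptrdiff_t maxfield; /* highest valid index; -1 when there are no fields */
    void **fields;      /* fixed-size array of value pointers */
};

/* Allocate nfields empty slots. Returns 0 on success, -1 on allocation failure. */
static int toy_object_init(struct toy_object *obj, size_t nfields)
{
    assert(obj != NULL);
    obj->fields = NULL;
    obj->maxfield = -1;
    if (nfields > 0) {
        obj->fields = calloc(nfields, sizeof *obj->fields);
        if (obj->fields == NULL)
            return -1;
        obj->maxfield = (ptrdiff_t)nfields - 1;
    }
    return 0;
}

/* Bounds-checked slot access: NULL when fieldix is outside 0..maxfield. */
static void **toy_object_slot(struct toy_object *obj, ptrdiff_t fieldix)
{
    assert(obj != NULL);
    if (fieldix < 0 || fieldix > obj->maxfield)
        return NULL;
    return &obj->fields[fieldix];
}

/* Release the slot array and reset the object to the empty state. */
static void toy_object_free(struct toy_object *obj)
{
    assert(obj != NULL);
    free(obj->fields);
    obj->fields = NULL;
    obj->maxfield = -1;
}
```

       Storing the maximum index rather than the count mirrors the description above and makes
       the range check a single comparison against the stored value.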

API

       The data structures described above are supported by the following API functions.

   Class Manipulation
       class_setup_stash

           void class_setup_stash(HV *stash);

       Called by the parser on encountering the "class" keyword. It upgrades the stash into being
       a class and prepares it for receiving class-specific items like methods and fields.

       class_seal_stash

           void class_seal_stash(HV *stash);

       Called by the parser at the end of a "class" block or, for unit classes, at the end of the
       containing scope. This function performs various finalisation activities that are required
       before instances of the class can be constructed, but could not have been done until all
       the information about the members of the class is known.

       Any additions to or modifications of the class under compilation must be performed between
       these two function calls. Classes cannot be modified once they have been sealed.
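
       The setup/seal lifecycle can be pictured as a three-state machine. The sketch below is a
       standalone illustration only: "toy_stash", "toy_setup", "toy_add_field" and "toy_seal"
       are hypothetical, and the integer error codes are a simplification of this model rather
       than a description of how the real functions report misuse.

```c
#include <assert.h>
#include <stddef.h>

enum toy_state { TOY_PACKAGE, TOY_CLASS_OPEN, TOY_CLASS_SEALED };

/* Toy stand-in for a stash and the class state attached to it. */
struct toy_stash {
    enum toy_state state;
    size_t next_fieldix; /* index the next added field will receive */
};

/* Models class_setup_stash(): a plain package becomes an open class. */
static int toy_setup(struct toy_stash *s)
{
    assert(s != NULL);
    if (s->state != TOY_PACKAGE)
        return -1;
    s->state = TOY_CLASS_OPEN;
    s->next_fieldix = 0;
    return 0;
}

/* Models adding a field: returns its field index, or -1 if the class is not open. */
static long toy_add_field(struct toy_stash *s)
{
    assert(s != NULL);
    if (s->state != TOY_CLASS_OPEN)
        return -1;
    return (long)s->next_fieldix++;
}

/* Models class_seal_stash(): no further modification is accepted afterwards. */
static int toy_seal(struct toy_stash *s)
{
    assert(s != NULL);
    if (s->state != TOY_CLASS_OPEN)
        return -1;
    s->state = TOY_CLASS_SEALED;
    return 0;
}
```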

       class_add_field

           void class_add_field(HV *stash, PADNAME *pn);

       Called by pad.c as part of defining a new field name in the current pad.  Note that this
       function does not create the padname; that must already be done by pad.c. This API
       function simply informs the class that the new field name has been created and is now
       available for it.

       class_add_ADJUST

           void class_add_ADJUST(HV *stash, CV *cv);

       Called by the parser once it has parsed and constructed a CV for a new "ADJUST" block.
       This gets added to the list stored by the class.

   Field Manipulation
       class_prepare_initfield_parse

           void class_prepare_initfield_parse();

       Called by the parser just before parsing an initializing expression for a field variable.
       This makes use of a suspended compcv to combine all the field initializing expressions
       into the same CV.

       class_set_field_defop

           void class_set_field_defop(PADNAME *pn, OPCODE defmode, OP *defop);

       Called by the parser after it has parsed an initializing expression for the field. Sets
       the defaulting expression and mode of application. "defmode" should either be zero, or one
       of "OP_ORASSIGN" or "OP_DORASSIGN" depending on the defaulting mode.
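
       The mapping from "defmode" to the two field flags can be sketched in isolation. In this
       hypothetical model the enum "toy_defmode" stands in for the real opcode numbers (whose
       values are not reproduced here), and "toy_field_default" and "toy_set_field_defop" are
       illustrative names only. It relies on "OP_ORASSIGN" being the op for "||=" and
       "OP_DORASSIGN" the op for "//=".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for zero, OP_ORASSIGN and OP_DORASSIGN. */
enum toy_defmode { TOY_NONE = 0, TOY_ORASSIGN, TOY_DORASSIGN };

/* Toy stand-in for the defaulting-related members of the field info. */
struct toy_field_default {
    const void *defop; /* opaque handle for the defaulting expression */
    bool def_if_undef;
    bool def_if_false;
};

/* Records the expression and its mode; returns false for an unknown mode. */
static bool toy_set_field_defop(struct toy_field_default *f,
                                enum toy_defmode defmode, const void *defop)
{
    assert(f != NULL);
    switch (defmode) {
    case TOY_NONE:      /* plain "=" */
        f->def_if_undef = false;
        f->def_if_false = false;
        break;
    case TOY_ORASSIGN:  /* "||=" */
        f->def_if_undef = false;
        f->def_if_false = true;
        break;
    case TOY_DORASSIGN: /* "//=" */
        f->def_if_undef = true;
        f->def_if_false = false;
        break;
    default:
        return false;   /* leave the field untouched */
    }
    f->defop = defop;
    return true;
}
```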

       padadd_FIELD

           #define padadd_FIELD

       This flag constant tells the "pad_add_name_*" family of functions that the new name should
       be added as a field. There is no need to call class_add_field(); this will be done
       automatically.

   Method Manipulation
       class_prepare_method_parse

           void class_prepare_method_parse(CV *cv);

       Called by the parser after start_subparse() but immediately before doing anything else.
       This prepares the "PL_compcv" for parsing a method: it arranges for the "CvIsMETHOD" test
       to be true, adds the $self lexical, and performs any other activities that may be required.

       class_wrap_method_body

           OP *class_wrap_method_body(OP *o);

       Called by the parser at the end of parsing a method body into an optree but just before
       wrapping it in the eventual CV. This function inserts extra ops into the optree to make
       the method work correctly.

   Object Instances
       SVt_PVOBJ

           #define SVt_PVOBJ

       An SV type constant used for comparison with the SvTYPE() macro.

       ObjectMAXFIELD

           SSize_t ObjectMAXFIELD(sv);

       A function-like macro that obtains the maximum valid field index that can be accessed from
       the "ObjectFIELDS" array.

       ObjectFIELDS

           SV **ObjectFIELDS(sv);

       A function-like macro that obtains the fields array directly out of an object instance.
       Fields can be accessed by their field index, from 0 up to the maximum valid index given by
       "ObjectMAXFIELD".

OPCODES

   OP_METHSTART
           newUNOP_AUX(OP_METHSTART, ...);

       An "OP_METHSTART" is an "UNOP_AUX" which must be present at the start of a method CV in
       order to make it work properly. This is inserted by class_wrap_method_body(), and even
       appears before any optree fragment associated with signature argument checking or
       extraction.

       This op is responsible for shifting the value of $self out of the arguments list and
       binding any field variables that the method requires access to into the pad. The AUX
       vector will contain details of the field/pad index pairings required.

       This op also performs sanity checking on the invocant value. It checks that it is
       definitely an object reference of a compatible class type. If not, an exception is thrown.

       If the "op_private" field includes the "OPpINITFIELDS" flag, this indicates that the op
       begins the special "xhv_class_initfields_cv" CV. In this case it should additionally take
       the second value from the arguments list, which should be a plain HV pointer (directly,
       not via RV), and bind it to the second pad slot, where the generated optree will expect to
       find it.
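
       The binding step performed at method entry can be illustrated with a toy model. The real
       layout of the AUX vector is not described here, so the "toy_fieldpair" array below is a
       hypothetical representation of the field/pad pairings, and "toy_methstart" is an
       illustrative function, not the op's implementation. A NULL invocant stands in for the
       failed class check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_PADIX_SELF 1 /* pad slot that holds the invocant */

/* Hypothetical pairing: copy fields[fieldix] into pad[padix]. */
struct toy_fieldpair {
    size_t padix;
    size_t fieldix;
};

/* Binds self and the requested fields into the pad. Returns false, leaving
 * the pad untouched, if the invocant or any pairing is invalid. */
static bool toy_methstart(void **pad, size_t padlen,
                          void *self, void **fields, ptrdiff_t maxfield,
                          const struct toy_fieldpair *pairs, size_t npairs)
{
    assert(pad != NULL && padlen > TOY_PADIX_SELF);
    if (self == NULL)
        return false;
    /* Validate every pairing first so a failure has no partial effect. */
    for (size_t i = 0; i < npairs; i++) {
        if (pairs[i].padix <= TOY_PADIX_SELF || pairs[i].padix >= padlen)
            return false;
        if (maxfield < 0 || pairs[i].fieldix > (size_t)maxfield)
            return false;
    }
    pad[TOY_PADIX_SELF] = self;
    for (size_t i = 0; i < npairs; i++)
        pad[pairs[i].padix] = fields[pairs[i].fieldix];
    return true;
}
```

       Only the fields named in the pairings are bound, which matches the idea that a method
       pays only for the fields it actually uses.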

   OP_INITFIELD
       An "OP_INITFIELD" is only invoked as part of the "xhv_class_initfields_cv" CV during the
       construction phase of an instance. This is the time that the individual SVs that make up
       the mutable fields of the instance (including AVs and HVs) are actually assigned into the
       "ObjectFIELDS" array. The "OPpINITFIELD_AV" and "OPpINITFIELD_HV" private flags indicate
       whether it is creating an AV or HV; if neither is set then an SV is created.

       If the op has the "OPf_STACKED" flag it expects to find an initializing value on the
       stack. For SVs this is the topmost SV on the data stack. For AVs and HVs it expects a
       marked list.

COMPILE-TIME BEHAVIOUR

   "ADJUST" Phasers
       At compile time, parsing of an "ADJUST" phaser is handled in a fundamentally different
       way to the existing perl phasers ("BEGIN", etc.).

       Rather than taking the usual route, the tokenizer recognises that the "ADJUST" keyword
       introduces a phaser block. The parser then parses the body of this block similarly to how
       it would parse an (anonymous) method body, creating a CV that has no name GV. This is then
       inserted directly into the class information by calling "class_add_ADJUST", entirely
       bypassing the symbol table.

   Attributes
       During compilation, attributes of both classes and fields are handled in a different way
       to existing perl attributes on subroutines and lexical variables.

       The parser still forms an "OP_LIST" optree of "OP_CONST" nodes, but these are passed to
       the "class_apply_attributes" or "class_apply_field_attributes" functions. Rather than
       using a class lookup for a method in the class being parsed, a fixed internal list of
       known attributes is used to find functions to apply the attribute to the class or field.
       In future this may support user-supplied extension attributes, though at present it only
       recognises ones defined by the core itself.

   Field Initializing Expressions
       During compilation, the parser makes use of a suspended compcv when parsing the defaulting
       expression for a field. All the expressions for all the fields in the class share the same
       suspended compcv, which is then compiled up into the same internal CV called by the
       constructor to initialize all the fields provided by that class.

RUNTIME BEHAVIOUR

   Constructor
       The generated constructor for a class itself is an XSUB which performs three tasks in
       order: it creates the instance SV itself, invokes the field initializers, then invokes the
       ADJUST block CVs. The constructor for any class is always the same basic shape, regardless
       of whether the class has a superclass or not.

       The field initializers are collected into a generated optree-based CV called the field
       initializer CV. This is the CV which contains all the optree fragments for the field
       initializing expressions. When invoked, the field initializer CV might make a chained call
       to the superclass initializer if one exists, before invoking all of the individual field
       initialization ops. The field initializer CV is invoked with two items on the stack: the
       instance SV and a direct HV containing the constructor parameters. Note carefully:
       this HV is passed directly, not via an RV reference. This is permitted because both the
       caller and the callee are directly generated code and not arbitrary pure-perl subroutines.

       The ADJUST block CVs are all collected into a single flat list, merging all of the ones
       defined by the superclass as well. They are all invoked in order, after the field
       initializer CV.
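
       The ordering of these steps can be traced with a small standalone model. This is only a
       sketch: "toy_ctor_class", "toy_initfields" and "toy_construct" are hypothetical, each CV
       is reduced to a single character written to a trace, and the placement of inherited
       "ADJUST" entries ahead of the class's own in the flat list is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_ADJUST 8

/* Toy stand-in for the constructor-related class data. */
struct toy_ctor_class {
    const struct toy_ctor_class *superclass; /* NULL for a base class */
    char init_tag;                    /* stands in for the field initializer CV */
    char adjust_tags[TOY_MAX_ADJUST]; /* flat ADJUST list, inherited ones first */
    size_t n_adjust;
};

/* Append one character to a NUL-terminated trace if room remains. */
static void toy_trace(char *trace, size_t cap, char tag)
{
    size_t len = strlen(trace);
    if (len + 1 < cap) {
        trace[len] = tag;
        trace[len + 1] = '\0';
    }
}

/* Chained call to the superclass initializer first, then this class's own. */
static void toy_initfields(const struct toy_ctor_class *c, char *trace, size_t cap)
{
    if (c->superclass != NULL)
        toy_initfields(c->superclass, trace, cap);
    toy_trace(trace, cap, c->init_tag);
}

/* The three constructor steps, in order. */
static void toy_construct(const struct toy_ctor_class *c, char *trace, size_t cap)
{
    assert(cap > 0);
    trace[0] = '\0';
    toy_trace(trace, cap, 'N');            /* 1: create the instance */
    toy_initfields(c, trace, cap);         /* 2: run the field initializers */
    for (size_t i = 0; i < c->n_adjust; i++)
        toy_trace(trace, cap, c->adjust_tags[i]); /* 3: run ADJUST blocks */
}
```

       Note that the "ADJUST" loop does not walk the superclass chain: in this model, as in the
       description above, the list is already flattened when the constructor runs.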

   $self Access During Methods
       When class_prepare_method_parse() is called, it arranges that the pad of the new CV body
       will begin with a lexical called $self. Because the pad should be freshly-created at this
       point, this will have the pad index of 1.  The function checks this and aborts if that is
       not true.

       Because of this fact, code within the body of a method or method-like CV can reliably use
       pad index 1 to obtain the invocant reference. The "OP_INITFIELD" opcode also relies on
       this fact.

       In similar fashion, within the "xhv_class_initfields_cv" CV the next pad slot, at pad
       index 2, is relied on to store the constructor parameters HV.

AUTHORS

       Paul Evans