NAME

       UR::Context - Manage the current state of the application

SYNOPSIS

         use AppNamespace;

         my $obj = AppNamespace::SomeClass->get(id => 1234);
         $obj->some_property('I am changed');

         UR::Context->get_current->rollback; # some_property reverts to its original value

         $obj->other_property('Now, I am changed');

         UR::Context->commit; # other_property now permanently has that value

DESCRIPTION

       The main application code will rarely interact with UR::Context objects directly, except for the "commit"
       and "rollback" methods.  The Context manages the mappings between an application's classes, object cache,
       and external data sources.

SUBCLASSES

       UR::Context is an abstract class.  When an application starts up, the system creates a handful of
       Contexts that logically exist within one another:

       1. UR::Context::Root - A context to represent all the data reachable in the application's namespace.  It
       connects the application to external data sources.
       2. UR::Context::Process - A context to represent the state of data within the currently running
       application.  It handles the transfer of data to and from the Root context, through the object cache, on
       behalf of the application code.
       3. UR::Context::Transaction - A context to represent an in-memory transaction as a diff of the object
       cache.  The Transaction keeps a list of changes to objects and is able to revert those changes with
       "rollback()", or apply them to the underlying context with "commit()".

CONSTRUCTOR

       begin
             my $trans = UR::Context::Transaction->begin();

           UR::Context::Transaction instances are created through "begin()".

       A  UR::Context::Root  and  UR::Context::Process  context  will  be  created  for you when the application
       initializes.  Additional instances of these classes are not usually instantiated.

METHODS

       Most of the methods below can be called as either a class or object method of UR::Context.  If called  as
       a class method, they will operate on the current context.

       get_current
             my $context = UR::Context::get_current();

           Returns the UR::Context instance of whatever is the most recently created Context.  Can be called as
           a class or object method.

       query_underlying_context
             my $should_load = $context->query_underlying_context();
             $context->query_underlying_context(1);

           A   property   of  the  Context  that  sets  the  default  value  of  the  $should_load  flag  inside
           "get_objects_for_class_and_rule" as described below.  Initially, its value  is  undef,  meaning  that
           during  a  get(),  the Context will query the underlying data sources only if this query has not been
           done before.  Setting this property to 0 will make the Context never query data sources, meaning that
           the only objects retrievable are those already in memory.  Setting the property to 1 means that every
           query will hit the data sources, even if the query has been done before.

       get_objects_for_class_and_rule
             @objs = $context->get_objects_for_class_and_rule(
                                   $class_name,
                                   $boolexpr,
                                   $should_load,
                                   $should_return_iterator
                               );

           This is the method that serves as the main entry point to the Context behind the "get()" and
           "is_loaded()" methods of UR::Object, and the "reload()" method of UR::Context.

           $class_name  and  $boolexpr are required arguments, and specify the target class by name and the rule
           used to filter the objects the caller is interested in.

           $should_load is a flag indicating whether the Context should load objects satisfying the rule from
           external data sources.  A true value means it should always ask the relevant data sources, even if
           the Context believes the requested data is in the object cache.  A false but defined value means the
           Context should not ask the data sources for new data, but only return what is currently in the cache
           matching the rule.  The value "undef" means the Context should use the value of its
           query_underlying_context property.  If that is also undef, then it will use its own judgement about
           asking the data sources for new data, and will merge cached and external data as necessary to fulfill
           the request.
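
           The three-way decision described above can be sketched in plain Perl.  This is only an
           illustration, not UR's implementation: the helper name "should_query_data_source" and its
           $seen_before argument are hypothetical, and the "own judgement" case is simplified to "query only
           if this request has not been made before".

```perl
use strict;
use warnings;

# Hypothetical helper, not part of UR: decide whether a request should go to
# the data sources, given the per-call flag, the context-wide default
# (query_underlying_context), and whether the same query was already done.
sub should_query_data_source {
    my ($should_load, $context_default, $seen_before) = @_;

    # An undefined per-call flag falls back to the context-wide property.
    $should_load = $context_default unless defined $should_load;

    # Still undefined: query only if this request has not been done before.
    return $seen_before ? 0 : 1 unless defined $should_load;

    # Defined: a true value forces a query, a false-but-defined value forbids one.
    return $should_load ? 1 : 0;
}

print should_query_data_source(undef, undef, 0), "\n";   # nothing cached yet
print should_query_data_source(undef, undef, 1), "\n";   # same query done before
print should_query_data_source(undef, 1,     1), "\n";   # context default forces a query
print should_query_data_source(0,     1,     0), "\n";   # per-call flag overrides the default
```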

           $should_return_iterator is a flag indicating whether this method should return the objects directly
           as a list, or an iterator function instead.  If true, it returns a subref that returns one object
           each time it is called, and undef after the last matching object:

             my $iter = $context->get_objects_for_class_and_rule(
                                      'MyClass',
                                      $rule,
                                      undef,
                                      1
                                  );
             my @objs;
             while (my $obj = $iter->()) {
                 push @objs, $obj;
             }

       has_changes
             my $bool = $context->has_changes();

           Returns true if any objects in the given Context's object cache (or the current Context if called  as
           a class method) have any changes that haven't been saved to the underlying context.

       commit
             UR::Context->commit();

           Causes all objects with changes to save their changes back to the underlying context.  If the current
           context is a UR::Context::Transaction, then the changes will be applied to whatever Context the
           transaction is a part of.  If the current context is a UR::Context::Process context, then "commit()"
           pushes the changes to the underlying UR::Context::Root context, meaning that those changes will be
           applied to the relevant data sources.

           In the usual case, where no transactions are in play and all data sources are RDBMS databases,
           calling "commit()" will cause the program to begin issuing SQL against the databases to update
           changed objects, insert rows for newly created objects, and delete rows for deleted objects as part
           of an SQL transaction.  If all the changes apply cleanly, it will do an SQL "commit", or "rollback"
           if not.

           commit() returns true if all the changes have been safely  transferred  to  the  underlying  context,
           false if there were problems.

       rollback
             UR::Context->rollback();

           Causes  all objects' changes for the current transaction to be reversed.  If the current context is a
           UR::Context::Transaction, then the transactional properties of those objects will be reverted to  the
           values  they  had  when the transaction started.  Outside of a transaction, object properties will be
           reverted to their values when they were loaded from the underlying data source.  rollback() will also
           ask all the underlying databases to rollback.

       clear_cache
             UR::Context->clear_cache();

           Asks the current context to remove all non-infrastructional data from its object cache.  This  method
           will fail and return false if any object has changes.

       resolve_data_source_for_object
             my $ds = $obj->resolve_data_source_for_object();

           For the given $obj object, return the UR::DataSource instance that object was loaded from or would be
           saved to.  If objects of that class do not have a data source, then it will return "undef".

       resolve_data_sources_for_class_meta_and_rule
             my @ds = $context->resolve_data_sources_for_class_meta_and_rule($class_obj, $boolexpr);

           For  the  given  class metaobject and boolean expression (rule), return the list of data sources that
           will need to be queried in order to return the objects matching the rule.  In most  cases,  only  one
           data source will be returned.

       infer_property_value_from_rule
             my $value = $context->infer_property_value_from_rule($property_name, $boolexpr);

           For the given boolean expression (rule) and the name of a property that is not mentioned in the rule
           but belongs to the class the rule is against, return the value that property must logically have.

           For example, if this object is the only TestClass object where "foo" is the value 'bar', it can infer
           that the TestClass property "baz" must have the value 'blah' in the current context.

             my $obj = TestClass->create(id => 1, foo => 'bar', baz => 'blah');
             my $rule = UR::BoolExpr->resolve('TestClass', foo => 'bar');
             my $val = $context->infer_property_value_from_rule('baz', $rule);
             # $val is now 'blah'

       object_cache_size_highwater
             UR::Context->object_cache_size_highwater(5000);
             my $highwater = UR::Context->object_cache_size_highwater();

           Set or get the value for the Context's object cache pruning high water mark.  The object cache pruner
           will be run during the next "get()" if the cache contains more than this number of prunable  objects.
           See the "Object Cache Pruner" section below for more information.

       object_cache_size_lowwater
             UR::Context->object_cache_size_lowwater(5000);
             my $lowwater = UR::Context->object_cache_size_lowwater();

           Set or get the value for the Context's object cache pruning low water mark.  The object cache pruner
           will stop when the number of prunable objects falls below this number.

       prune_object_cache
             UR::Context->prune_object_cache();

           Manually run the object cache pruner.

       reload
             UR::Context->reload($object);
             UR::Context->reload('Some::Class', 'property_name', $value);

           Ask  the  context  to load an object's data from an underlying Context, even if the object is already
           cached.  With a single parameter, it will use that object's ID parameters as the basis  for  querying
           the data source.  "reload" will also accept a class name and list of key/value parameters the same as
           "get".

       _light_cache
             UR::Context->_light_cache(1);

           Turn on or off the light caching flag.  Light caching alters the behavior of the object cache in that
           all  object  references  in  the  cache are made weak by Scalar::Util::weaken().  This means that the
           application code must keep hold of any object references it  wants  to  keep  alive.   Light  caching
           defaults to being off, and must be explicitly turned on with this method.

Custom observer aspects

       UR::Context sends signals for observers watching for some non-standard aspects.

       precommit
         After  "commit()"  has  been  called,  but  before any changes are saved to the data sources.  The only
         parameters to the Observer's callback are the Context object and the aspect ("precommit").

       commit
         After "commit()" has been called, and after an attempt has been made to save the changes  to  the  data
         sources.   The  parameters to the callback are the Context object, the aspect ("commit"), and a boolean
         value indicating whether the commit succeeded or not.

       prerollback
         After "rollback()" has been called, but before any object state is reverted.

       rollback
         After "rollback()" has been called, and after an attempt has been made to revert the state of  all  the
         loaded  objects.  The parameters to the callback are the Context object, the aspect ("rollback"), and a
         boolean value indicating whether the rollback succeeded or not.

Data Concurrency

       Currently, the Context is optimistic about data concurrency, meaning that it does very little to prevent
       clobbering data in underlying Contexts during a commit() if other processes have changed an object's data
       after the Context has cached an object.  For example, a database has an object with ID 1 and a property
       with value 'bob'.  A program loads this object and changes the property to 'fred', but does not yet
       commit().  Meanwhile, another program loads the same object, changes the value to 'joe' and does
       commit().  Finally the first program calls commit().  The final value in the database will be 'fred', and
       no exceptions will be raised.
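
       The 'bob' / 'fred' / 'joe' scenario can be replayed with a small stand-alone sketch.  It does not
       use UR at all: a plain hash stands in for the database, and the helpers "load_copy" and "commit_copy"
       are hypothetical stand-ins for loading an object into a program's cache and committing it.

```perl
use strict;
use warnings;

# A hash stands in for the database table (an assumption for illustration).
my %database = (1 => { name => 'bob' });

# Hypothetical stand-ins for "load into the object cache" and "commit()".
sub load_copy   { my ($id) = @_; return +{ %{ $database{$id} } } }
sub commit_copy { my ($id, $copy) = @_; $database{$id} = { %$copy }; return 1 }

my $first = load_copy(1);      # the first program caches the object
$first->{name} = 'fred';       # ... and changes it, but does not commit yet

my $second = load_copy(1);     # another program loads the same object
$second->{name} = 'joe';
commit_copy(1, $second);       # the second program commits first

commit_copy(1, $first);        # the first program commits last and overwrites 'joe'

print "final value: $database{1}{name}\n";
```

       Nothing in this flow compares the database's current value with the value that was originally
       loaded, which is why the second program's change is lost without any error.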

       As part of the caching behavior, the Context keeps a record of what the object's state is as it's loaded
       from the underlying Context.  This is how the Context knows which objects have been changed during
       "commit()".

       If an already cached object's data is reloaded as part of some other query, data consistency of each
       property will be checked.  If there are no conflicting changes, then any differences between the object's
       initial state and the current state in the underlying Context will be applied to the object's notion of
       what it thinks its initial state is.

       In some future release, UR may support additional data concurrency methods such as pessimistic
       concurrency: checking that the current state of all changed (or even all cached) objects in the
       underlying Context matches the initial state before committing changes downstream.  Another possibility
       is allowing the object cache to operate in write-through mode for some or all classes.

Internal Methods

       There are many methods in UR::Context that are meant to be used internally but are worth documenting
       for anyone interested in the inner workings of the Context code.

       _create_import_iterator_for_underlying_context
             $subref = $context->_create_import_iterator_for_underlying_context(
                           $boolexpr, $data_source, $serial_number
                       );
             $next_obj = $subref->();

           This method is part of the object loading process, and is called by
           "get_objects_for_class_and_rule" when it is determined that the requested data does not exist in
           the object cache, and data should be brought in from another, underlying Context.  Usually this
           means the data will be loaded from an external data source.

           $boolexpr is the UR::BoolExpr rule, usually from the application code.

           $data_source is the UR::DataSource that data will be loaded from.

           $serial_number is used by the object cache pruner.  Each object loaded through this iterator will
           have $serial_number in its "__get_serial" hashref key.

           It works by first getting an iterator for the data source (the $db_iterator).  It calls
           "_resolve_query_plan_for_ds_and_bxt" to find out how data is to be loaded and whether this request
           spans multiple data sources.  It calls "__create_object_fabricator_for_loading_template" to get a
           list of closures to transform the primary data source's data into UR objects, and
           "_create_secondary_loading_closures" (if necessary) to get more closures that can load and join
           data from the primary to the secondary data source(s).

           It returns a subref that works as an iterator, loading and returning objects one at a time from
           the underlying context into the current context.  It returns undef when there are no more objects
           to return.

           The returned iterator works by first asking the $db_iterator for the next row of data as a
           listref.  It then asks the secondary data source joiners whether there is any matching data, and
           calls the object fabricator closures to convert the data source data into UR objects.  If any of
           the objects require subclassing, then additional importing iterators are created to handle that.
           Finally, the objects matching the rule are returned to the caller one at a time.
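
           The general shape of such an iterator (a closure that pulls rows, turns them into objects, and
           returns undef when exhausted) can be sketched without UR.  Everything here is hypothetical:
           "make_import_iterator", the array of rows standing in for the $db_iterator, and the fabricator
           and filter closures are illustrative names only, and secondary data sources and subclassing are
           left out.

```perl
use strict;
use warnings;

# Hypothetical sketch: combine a row source, a "fabricator" closure and a
# rule check into an iterator that returns one object per call.
sub make_import_iterator {
    my ($rows, $fabricator, $matches_rule) = @_;
    my @pending = @$rows;              # stands in for the $db_iterator

    return sub {
        while (my $row = shift @pending) {
            my $obj = $fabricator->($row);          # raw listref -> object
            return $obj if $matches_rule->($obj);   # hand back matching objects only
        }
        return;                        # no more rows: undef in scalar context
    };
}

my $iter = make_import_iterator(
    [ [1, 'bob'], [2, 'fred'], [3, 'joe'] ],
    sub { my ($row) = @_; return +{ id => $row->[0], name => $row->[1] } },
    sub { $_[0]{id} != 2 },
);

my @names;
while (my $obj = $iter->()) {
    push @names, $obj->{name};
}
print "@names\n";
```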

       _resolve_query_plan_for_ds_and_bxt
             my $query_plan = $context->_resolve_query_plan_for_ds_and_bxt(
                                           $data_source,
                                           $boolexpr_tmpl
                                       );
             my($query_plan, @addl_info) = $context->_resolve_query_plan_for_ds_and_bxt(
                                           $data_source,
                                           $boolexpr_tmpl
                                       );

           When a request is made that will hit one or more data sources,
           "_resolve_query_plan_for_ds_and_bxt" is used to call a method of the same name on the data source.
           It returns a hashref that is used by many other parts of the object loading system.  The hashref
           describes which data source to use, how to query that data source to get the objects, how to use
           the raw data returned by the data source to construct objects, and how to resolve any delegated
           properties that are a part of the rule.

           $data_source is a UR::DataSource object ID.  $boolexpr_tmpl is a UR::BoolExpr::Template object.

           In the common case, the query will only use one data source, and this method returns that hashref
           directly.  But if the primary data source sets the "joins_across_data_sources" key on the data
           structure, as may be the case when a rule involves a delegated property to a class that uses a
           different data source, then this method returns an additional list of data.  For each additional
           data source needed to resolve the query, this list will have three items:

           1. The secondary data source ID

           2. A listref of delegated UR::Object::Property objects joining the primary data source to this
              secondary data source

           3. A UR::BoolExpr::Template rule template applicable against the secondary data source

       _create_secondary_rule_from_primary
             my $new_rule = $context->_create_secondary_rule_from_primary(
                                         $primary_rule,
                                         $delegated_properties,
                                         $secondary_rule_tmpl
                                     );

           When resolving a request that requires multiple data sources, this method is used to construct a
           rule applicable against the secondary data source.  $primary_rule is the UR::BoolExpr rule used in
           the original query.  $delegated_properties is a listref of UR::Object::Property objects, as
           returned by "_resolve_query_plan_for_ds_and_bxt()", linking the primary to the secondary data
           source.  $secondary_rule_tmpl is the rule template, also as returned by
           "_resolve_query_plan_for_ds_and_bxt()".

       _create_secondary_loading_closures
             my($obj_importers, $joiners) = $context->_create_secondary_loading_closures(
                                               $primary_rule_tmpl,
                                               @addl_info
                                           );

           When resolving a request that spans multiple data sources, this method is used to construct two
           lists of subrefs to aid in the request.  $primary_rule_tmpl is the UR::BoolExpr::Template rule
           template made from the original rule.  @addl_info is the same list returned by
           "_resolve_query_plan_for_ds_and_bxt".  For each secondary data source, there will be one item in
           each of the two listrefs that are returned, and in the same order.

           $obj_importers is a listref of subrefs used as object importers.  They transform the raw data
           returned by the data sources into UR objects.

           $joiners is also a listref of subrefs.  These closures know how the properties link the primary
           data source data to the secondary data source.  They take the raw data from the primary data
           source, load the next row of data from the secondary data source, and return the secondary data
           that successfully joins to the primary data.  You can think of these closures as performing the
           same work as an SQL "join" between data in different data sources.

       _cache_is_complete_for_class_and_normalized_rule
             ($is_cache_complete, $objects_listref) =
                 $context->_cache_is_complete_for_class_and_normalized_rule(
                               $class_name, $boolexpr
                           );

           This method is part of the object loading process, and is called by
           "get_objects_for_class_and_rule" to determine if the objects requested by the UR::BoolExpr
           $boolexpr will be found entirely in the object cache.  If the answer is yes then
           $is_cache_complete will be true.  $objects_listref may or may not contain objects matching the
           rule from the cache.  If that list is not returned, then "get_objects_for_class_and_rule" does
           additional work to locate the matching objects itself via
           "_get_objects_for_class_and_rule_from_cache".

           It does its magic by looking at the $boolexpr and loosely matching it against the query cache,
           $UR::Context::all_params_loaded.

       _get_objects_for_class_and_rule_from_cache
             @objects = $context->_get_objects_for_class_and_rule_from_cache(
                           $class_name, $boolexpr
                       );

           This method is called by "get_objects_for_class_and_rule" when
           _cache_is_complete_for_class_and_normalized_rule says the requested objects do exist in the cache,
           but did not return those items directly.

           The UR::BoolExpr $boolexpr contains hints about how the matching data is likely to be found.  Its
           "_context_query_strategy" key will contain one of four values:

           1. all
              This rule is against a class with no filters, meaning it should return every member of that
              class.  It calls "$class->all_objects_loaded" to extract all objects of that class in the
              object cache.

           2. id
              This rule is against a class and filters by only a single ID, or a list of IDs.  The request is
              fulfilled by plucking the matching objects right out of the object cache.

           3. index
              This rule is against one or more non-id properties.  An index is built mapping the filtered
              properties and their values to the cached objects which have those values.  The request is
              fulfilled by using the index to find objects matching the filter.

           4. set intersection
              This is a group-by rule and will return a ::Set object.

       _loading_was_done_before_with_a_superset_of_this_params_hashref
             $bool = $context->_loading_was_done_before_with_a_superset_of_this_params_hashref(
                                   $class_name,
                                   $params_hashref
                               );

           This method is used by "_cache_is_complete_for_class_and_normalized_rule" to determine if the
           requested data was asked for previously, either from a get() asking for a superset of the current
           request, or from a request on a parent class of the current request.

           For example, if a get() is done on a class with one param:

             @objs = ParentClass->get(param_1 => 'foo');

           And then later, another request is done with an additional param:

             @objs2 = ParentClass->get(param_1 => 'foo', param_2 => 'bar');

           Then the first request must have returned all the data that could have possibly satisfied the
           second request, and so the system will not issue a query against the data source.

           As another example, given those two previously done queries, if another get() is done on a class
           that inherits from ParentClass:

             @objs3 = ChildClass->get(param_1 => 'foo');

           then, again, the first request has already loaded all the relevant data, and the system won't
           query the data source.
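
           The reasoning in these examples (an earlier query with fewer filters, on the same class or a
           parent class, already covers a narrower later query) can be sketched in plain Perl.  This is a
           simplified illustration, not UR's code: the @previous list and the function
           "covered_by_previous_query" are hypothetical, and only exact-match filters are considered.

```perl
use strict;
use warnings;

# Hypothetical classes for the example; ChildClass inherits from ParentClass.
{ package ParentClass; }
{ package ChildClass; our @ISA = ('ParentClass'); }

# Record of earlier queries as [class, filters] pairs.  This layout is an
# assumption for illustration; UR keeps this information in its own structures.
my @previous = ( [ 'ParentClass', { param_1 => 'foo' } ] );

# True if some earlier query on the same class or a parent class used only
# filters that also appear, with the same values, in the new request.
sub covered_by_previous_query {
    my ($class, $params) = @_;
    QUERY: for my $query (@previous) {
        my ($prev_class, $prev_params) = @$query;
        next QUERY unless $class->isa($prev_class);
        for my $key (keys %$prev_params) {
            next QUERY unless exists $params->{$key}
                          and $params->{$key} eq $prev_params->{$key};
        }
        return 1;    # every earlier filter is repeated in the new request
    }
    return 0;
}

print covered_by_previous_query('ParentClass', { param_1 => 'foo', param_2 => 'bar' }), "\n";
print covered_by_previous_query('ChildClass',  { param_1 => 'foo' }), "\n";
print covered_by_previous_query('ParentClass', { param_2 => 'bar' }), "\n";
```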

       _sync_databases
             $bool = $context->_sync_databases();

           Starts the process of committing all the Context's changes to the external data sources.
           _sync_databases() is the workhorse behind "commit".

           First, it finds all objects with changes.  It then checks those changed objects for validity with
           "$obj->invalid".  If any objects are found invalid, then _sync_databases() will fail.  Finally, it
           bins all the changed objects by data source, and asks each data source to save those objects'
           changes.  It returns true if all the data sources were able to save the changes, false otherwise.

       _reverse_all_changes
             $bool = $context->_reverse_all_changes();

           _reverse_all_changes() is the workhorse behind "rollback".

           For each class, it goes through each object of that class.  If the object is a UR::Object::Ghost,
           representing a deleted object, it converts the ghost back to the live version of the object.  For
           all other objects, it makes a list of properties that have changed since they were loaded
           (represented by the "db_committed" hash key in the object), and reverts those changes by using
           each property's accessor method.

The Object Cache

       The object cache is integral to the way the Context works, and is also the main difference between UR and
       other ORMs.  Other systems do no caching and require the calling application to hold references to any
       objects it is interested in.  If one part of the app loads data from the database and gives up its
       references, and another part of the app later makes the same or a similar query, it will have to ask the
       database again.

       UR handles caching of classes, objects and queries to avoid asking the  data  sources  for  data  it  has
       loaded  previously.   The  object  cache  is  essentially a software transaction that sits above whatever
       database transaction is active.  After objects are loaded, any changes, creations or deletions exist only
       in the object cache, and are not saved to the underlying data sources until  the  application  explicitly
       requests a commit or rollback.

       Objects  are  returned to the application only after they are inserted into the object cache.  This means
       that if disconnected parts of the application are returned objects with the same class and ID, they will
       have references to the exact same object, and changes made in one part will be visible to all
       other parts of the app.  An unchanged object can  be  removed  from  the  object  cache  by  calling  its
       "unload()" method.

       Since  changes  to  the  underlying  data  sources  are  effectively  delayed,  it  is  possible that the
       application's notion of the object's current state does not match the data stored  in  the  data  source.
       You  can  mitigate  this  by using the "load()" class or object method to fetch the latest data if it's a
       problem.  Another issue to be aware of is if multiple programs are likely to commit  conflicting  changes
       to  the  same data, then whichever applies its changes last will win; some kind of external locking needs
       to be applied.  Finally, if two programs attempt to insert data with the same ID columns  into  an  RDBMS
       table, the second application's commit will fail, since that will likely violate a constraint.

   Object Change Tracking
       As  objects  are  loaded from their data sources, their properties are initialized with the data from the
       query, and a copy of the same data is stored in the object in its "db_committed" hash  key.   Anyone  can
       ask  the object for a list of its changes by calling "$obj->changed".  Internally, changed() goes through
       all the object's properties, comparing the current values in the object's hash with the same  keys  under
       'db_committed'.

       Objects created through the "create()" class method have no 'db_committed', and so the object knows it is
       a newly created object in this context.
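
       The snapshot comparison described above can be sketched with plain hashes.  This is an illustration
       only: "changed_properties" and "revert_changes" are hypothetical helpers, not UR methods, and a real
       UR object reverts values through its property accessors rather than by writing to the hash directly.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a loaded object: current values plus a snapshot
# of the loaded values under 'db_committed'.
my $obj = {
    id           => 1,
    name         => 'bob',
    db_committed => { id => 1, name => 'bob' },
};

# List the properties whose current value differs from the snapshot.
sub changed_properties {
    my ($o) = @_;
    my $saved = $o->{db_committed} or return;    # newly created: no snapshot
    my @diff = grep { ($o->{$_} // '') ne ($saved->{$_} // '') } sort keys %$saved;
    return @diff;
}

# Put every changed property back to its snapshot value.
sub revert_changes {
    my ($o) = @_;
    $o->{$_} = $o->{db_committed}{$_} for changed_properties($o);
    return $o;
}

$obj->{name} = 'fred';
my @changed = changed_properties($obj);
print "changed: @changed\n";

revert_changes($obj);
print "name after revert: $obj->{name}\n";
```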

       Every  time  an  object is retrieved with get() or through an iterator, it is assigned a serial number in
       its "__get_serial" hash key from  the  $UR::Context::GET_SERIAL  counter.   This  number  is  unique  and
       increases  with  each  get(),  and  is  used  by  the  "Object Cache Pruner" to expire the least recently
       requested data.

       Objects also track what parameters have been used to get() them in the hash "$obj->{__load}".  This is  a
       copy  of the data in "$UR::Context::all_params_loaded->{$template_id}".  For each rule ID, it will have a
       count of the number of times that rule was used in a get().

   Deleted Objects and Ghosts
       Calling delete() on an object is tracked in a different way.  First, a new object is  created,  called  a
       ghost.   Ghost  classes exist for every class in the application and are subclasses of UR::Object::Ghost.
       For example, the ghost class for MyClass is MyClass::Ghost.  This ghost object is  initialized  with  the
       data  from  the  original object.  The original object is removed from the object cache, and is reblessed
       into the UR::DeletedRef class.  Any attempt to interact with the object further will raise an exception.

       Ghost objects are not included in a get() request on the regular class, though the app can ask  for  them
       specifically using "MyClass::Ghost->get(%params)".

       Ghost  classes  do  not  have ghost classes themselves.  Calling create() or delete() on a Ghost class or
       object will raise an exception.  Calling other methods on the Ghost object that exist  on  the  original,
       live class will delegate over to the live class's method.

   all_objects_are_loaded
       $UR::Context::all_objects_are_loaded  is  a  hashref  keyed  by  class names.  If the value is true, then
       "_cache_is_complete_for_class_and_normalized_rule" knows that all the instances of that  class  exist  in
       the object cache, and it can avoid asking the underlying context/datasource for that class' data.

   all_params_loaded
       $UR::Context::all_params_loaded  is  a  two-level  hashref.   The first level is class names.  The second
       level is rule (UR::BoolExpr) IDs.  The values are how many times that class and rule have  been  involved
       in  a  get().   This data is used by "_loading_was_done_before_with_a_superset_of_this_params_hashref" to
       determine if the requested data will be found in the object cache for non-id queries.

   all_objects_loaded
       $UR::Context::all_objects_loaded is a two-level hashref.  The first level is  class  names.   The  second
       level  is  object IDs.  Every time an object is created, defined or loaded from an underlying context, it
       is inserted into the "all_objects_loaded" hash.  For queries involving only ID  properties,  the  Context
       can retrieve them directly out of the cache if they appear there.

       The entire cache can be purged of non-infrastructional objects by calling "clear_cache".

   Object Cache Pruner
       The  default  Context behavior is to cache all objects it knows about for the entire life of the process.
       For programs that churn through large amounts of data, or live for a long time, this is probably not what
       you want.

       The Context has two settings to loosely control the size of the object cache:
       "object_cache_size_highwater" and "object_cache_size_lowwater".  As objects are created and loaded, a
       count of prunable objects is kept in $UR::Context::all_objects_cache_size.  The first part of
       "get_objects_for_class_and_rule" checks to see if the current size is greater than the highwater setting,
       and calls "prune_object_cache" if so.

       prune_object_cache() works by looking at what $UR::Context::GET_SERIAL was the last time it ran, and what
       it  is  now, and making a guess about what object serial number to use as a guide for removing objects by
       starting at 10% of the difference between the last serial  and  the  current  value,  called  the  target
       serial.

       It then starts executing a loop as long as $UR::Context::all_objects_cache_size is greater than the
       lowwater setting.  For each prunable object, if its "__get_serial" is less than the target serial, it
       is weakened from any UR::Object::Indexes it may be a member of, and then weakened from the main object
       cache, $UR::Context::all_objects_loaded.

       The application may lock an object in the cache by calling "__strengthen__" on it.  Likewise, the app may
       hint to the pruner to throw away an object as soon as possible by calling "__weaken__".
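
       The high/low water idea can be sketched with an ordinary hash.  This is a loose illustration under
       several assumptions, not the real pruner: the function "prune_cache" and the cache layout are
       hypothetical, objects are deleted outright instead of having their references weakened, and raising
       the target serial by another 10% step on each pass is a guess at how the loop makes progress.

```perl
use strict;
use warnings;

# Hypothetical cache: object ID => { serial => N }.  Serial numbers grow with
# each simulated get(), so lower serials mean less recently requested data.
my %cache;
my $get_serial = 0;
$cache{$_} = { serial => ++$get_serial } for 1 .. 20;

sub prune_cache {
    my ($cache, $last_serial, $current_serial, $lowwater) = @_;

    # Start the target at 10% of the distance between the two serials.
    my $step   = int( ($current_serial - $last_serial) * 0.10 ) || 1;
    my $target = $last_serial + $step;

    while (scalar(keys %$cache) > $lowwater && $target <= $current_serial + $step) {
        # Visit the oldest objects first so the result is predictable.
        for my $id (sort { $cache->{$a}{serial} <=> $cache->{$b}{serial} } keys %$cache) {
            last if scalar(keys %$cache) <= $lowwater;
            delete $cache->{$id} if $cache->{$id}{serial} < $target;
        }
        $target += $step;    # assumption: widen the window and try again
    }
    return scalar(keys %$cache);
}

my $remaining = prune_cache(\%cache, 0, $get_serial, 10);
print "objects left in cache: $remaining\n";
```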