Environments

Advanced R by Hadley Wickham

The environment is the data structure that powers scoping. This chapter dives deep into environments, describing their structure in depth, and using them to improve your understanding of the four scoping rules described in lexical scoping.

Environments can also be useful data structures in their own right because they have reference semantics. When you modify a binding in an environment, the environment is not copied; it’s modified in place. Reference semantics are not often needed, but can be extremely useful.

If you can answer the following questions correctly, you already know the most important topics in this chapter. You can find the answers at the end of the chapter, under Quiz answers.

List at least three ways that an environment is different to a list.

What is the parent of the global environment? What is the only environment that doesn’t have a parent?

What is the enclosing environment of a function? Why is it important?

How do you determine the environment from which a function was called?

How are <- and <<- different?

Environment basics introduces you to the basic properties of an environment and shows you how to create your own.

Recursing over environments provides a function template for computing with environments, illustrating the idea with a useful function.

Function environments revisits R’s scoping rules in more depth, showing how they correspond to four types of environment associated with each function.

Binding names to values describes the rules that names must follow (and how to bend them), and shows some variations on binding a name to a value.

Explicit environments discusses three problems where environments are useful data structures in their own right, independent of the role they play in scoping.

Prerequisites

This chapter uses many functions from the pryr package to pry open R and look inside at the messy details. You can install pryr by running install.packages("pryr").

Environment basics

The job of an environment is to associate, or bind , a set of names to a set of values. You can think of an environment as a bag of names:

Each name points to an object stored elsewhere in memory:

The objects don’t live in the environment so multiple names can point to the same object:

Confusingly they can also point to different objects that have the same value:

If an object has no names pointing to it, it gets automatically deleted by the garbage collector. This process is described in more detail in gc .

Every environment has a parent, another environment. In diagrams, I’ll represent the pointer to parent with a small black circle. The parent is used to implement lexical scoping: if a name is not found in an environment, then R will look in its parent (and so on). Only one environment doesn’t have a parent: the empty environment.

We use the metaphor of a family to refer to environments. The grandparent of an environment is the parent’s parent, and the ancestors include all parent environments up to the empty environment. It’s rare to talk about the children of an environment because there are no back links: given an environment we have no way to find its children.

Generally, an environment is similar to a list, with four important exceptions:

Every name in an environment is unique.

The names in an environment are not ordered (i.e., it doesn’t make sense to ask what the first element of an environment is).

An environment has a parent.

Environments have reference semantics.

More technically, an environment is made up of two components, the frame , which contains the name-object bindings (and behaves much like a named list), and the parent environment. Unfortunately “frame” is used inconsistently in R. For example, parent.frame() doesn’t give you the parent frame of an environment. Instead, it gives you the calling environment. This is discussed in more detail in calling environments .

There are four special environments:

The globalenv() , or global environment, is the interactive workspace. This is the environment in which you normally work. The parent of the global environment is the last package that you attached with library() or require() .

The baseenv() , or base environment, is the environment of the base package. Its parent is the empty environment.

The emptyenv() , or empty environment, is the ultimate ancestor of all environments, and the only environment without a parent.

The environment() is the current environment.

search() lists all parents of the global environment. This is called the search path because objects in these environments can be found from the top-level interactive workspace. It contains one environment for each attached package and any other objects that you’ve attach() ed. It also contains a special environment called Autoloads which is used to save memory by only loading package objects (like big datasets) when needed.

You can access any environment on the search list using as.environment() .

globalenv() , baseenv() , the environments on the search path, and emptyenv() are connected as shown below. Each time you load a new package with library() it is inserted between the global environment and the package that was previously at the top of the search path.

To create an environment manually, use new.env() . You can list the bindings in the environment’s frame with ls() and see its parent with parent.env() .

The easiest way to modify the bindings in an environment is to treat it like a list:

By default, ls() only shows names that don’t begin with . . Use all.names = TRUE to show all bindings in an environment:
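The steps described above can be sketched in base R as follows. The names e, a, b, and .hidden are made up for illustration.

```r
# Create a new environment; by default its parent is the environment
# from which new.env() was called.
e <- new.env()

# Add or change bindings as you would with a list.
e$a <- 1
e[["b"]] <- "two"
e$.hidden <- TRUE

visible <- ls(e)                       # skips names that begin with "."
everything <- ls(e, all.names = TRUE)  # also includes .hidden
parent <- parent.env(e)                # the environment's parent
```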

Another useful way to view an environment is ls.str() . It is more useful than str() because it shows each object in the environment. Like ls() , it also has an all.names argument.

Given a name, you can extract the value to which it is bound with $ , [[ , or get() :

$ and [[ look only in one environment and return NULL if there is no binding associated with the name.

get() uses the regular scoping rules and throws an error if the binding is not found.

Deleting objects from environments works a little differently from lists. With a list you can remove an entry by setting it to NULL . In environments, that will create a new binding to NULL . Instead, use rm() to remove the binding.

You can determine if a binding exists in an environment with exists() . Like get() , its default behaviour is to follow the regular scoping rules and look in parent environments. If you don’t want this behavior, use inherits = FALSE :
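A minimal base R sketch of the lookup, removal, and existence rules described above. The environments e and child and the names x and y are assumptions chosen for the example.

```r
# An isolated environment: its parent is the empty environment,
# so lookups cannot fall through to anything else.
e <- new.env(parent = emptyenv())
e$x <- 1

missing_dollar <- e$y        # NULL: $ does not signal an error
missing_bracket <- e[["y"]]  # NULL as well
missing_get <- tryCatch(
  get("y", envir = e),
  error = function(err) "error"  # get() signals an error instead
)

# Setting a binding to NULL keeps the name; rm() removes the binding.
e$x <- NULL
still_there <- exists("x", envir = e, inherits = FALSE)
rm("x", envir = e)
gone <- !exists("x", envir = e, inherits = FALSE)

# exists() follows parents unless inherits = FALSE.
child <- new.env(parent = baseenv())
found_in_parent <- exists("sum", envir = child)
found_locally <- exists("sum", envir = child, inherits = FALSE)
```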

To compare environments, you must use identical() not == :
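For instance, assuming three illustrative names e1, e2, and e3:

```r
e1 <- new.env()
e2 <- e1         # a second name for the same environment
e3 <- new.env()  # a different environment

same <- identical(e1, e2)
different <- identical(e1, e3)

# == is not defined for environments, so this signals an error,
# which try() captures instead of stopping the script.
eq_fails <- inherits(try(e1 == e2, silent = TRUE), "try-error")
```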

List three ways in which an environment differs from a list.

If you don’t supply an explicit environment, where do ls() and rm() look? Where does <- make bindings?

Using parent.env() and a loop (or a recursive function), verify that the ancestors of globalenv() include baseenv() and emptyenv() . Use the same basic idea to implement your own version of search() .

Recursing over environments

Environments form a tree, so it’s often convenient to write a recursive function. This section shows you how by applying your new knowledge of environments to understand the helpful pryr::where() . Given a name, where() finds the environment where that name is defined, using R’s regular scoping rules:

The definition of where() is straightforward. It has two arguments: the name to look for (as a string), and the environment in which to start the search. (We’ll learn later why parent.frame() is a good default in calling environments .)

There are three cases:

The base case: we’ve reached the empty environment and haven’t found the binding. We can’t go any further, so we throw an error.

The successful case: the name exists in this environment, so we return the environment.

The recursive case: the name was not found in this environment, so try the parent.
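The three cases can be sketched in base R as below. This is an illustration consistent with the description above, not necessarily identical to the pryr source; the environments e_outer and e_inner are hypothetical.

```r
where <- function(name, env = parent.frame()) {
  if (identical(env, emptyenv())) {
    # Base case: nowhere left to look
    stop("Can't find ", name, call. = FALSE)
  } else if (exists(name, envir = env, inherits = FALSE)) {
    # Success case: the name is bound in this environment
    env
  } else {
    # Recursive case: try the parent
    where(name, parent.env(env))
  }
}

# Two small environments to exercise each case.
e_outer <- new.env(parent = emptyenv())
e_outer$b <- 2
e_inner <- new.env(parent = e_outer)
e_inner$a <- 1

found_a <- where("a", e_inner)  # e_inner
found_b <- where("b", e_inner)  # e_outer
found_c <- tryCatch(where("c", e_inner), error = function(err) NULL)
```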

It’s easier to see what’s going on with an example. Imagine you have two environments as in the following diagram:

If you’re looking for a , where() will find it in the first environment.

If you’re looking for b , it’s not in the first environment, so where() will look in its parent and find it there.

If you’re looking for c , it’s not in the first environment, or the second environment, so where() reaches the empty environment and throws an error.

It’s natural to work with environments recursively, so where() provides a useful template. Removing the specifics of where() shows the structure more clearly:

Iteration vs. recursion

It’s possible to use a loop instead of recursion. This might run slightly faster (because we eliminate some function calls), but I think it’s harder to understand. I include it because you might find it easier to see what’s happening if you’re less familiar with recursive functions.
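A loop-based sketch might look like this. The name where_loop is hypothetical, and signalling an error when nothing is found is a design choice made here so that it matches the recursive description.

```r
where_loop <- function(name, env = parent.frame()) {
  # Walk up the chain of parents until we run out of environments.
  while (!identical(env, emptyenv())) {
    if (exists(name, envir = env, inherits = FALSE)) {
      return(env)
    }
    env <- parent.env(env)
  }
  stop("Can't find ", name, call. = FALSE)
}

top <- new.env(parent = emptyenv())
top$b <- 2
bottom <- new.env(parent = top)

hit <- where_loop("b", bottom)
miss <- tryCatch(where_loop("zzz", bottom), error = function(err) NULL)
```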

Modify where() to find all environments that contain a binding for name .

Write your own version of get() using a function written in the style of where() .

Write a function called fget() that finds only function objects. It should have two arguments, name and env , and should obey the regular scoping rules for functions: if there’s an object with a matching name that’s not a function, look in the parent. For an added challenge, also add an inherits argument which controls whether the function recurses up the parents or only looks in one environment.

Write your own version of exists(inherits = FALSE) (Hint: use ls() .) Write a recursive version that behaves like exists(inherits = TRUE) .

Function environments

Most environments are not created by you with new.env() but are created as a consequence of using functions. This section discusses the four types of environments associated with a function: enclosing, binding, execution, and calling.

The enclosing environment is the environment where the function was created. Every function has one and only one enclosing environment. For the three other types of environment, there may be 0, 1, or many environments associated with each function:

Binding a function to a name with <- defines a binding environment.

Calling a function creates an ephemeral execution environment that stores variables created during execution.

Every execution environment is associated with a calling environment, which tells you where the function was called.

The following sections will explain why each of these environments is important, how to access them, and how you might use them.

The enclosing environment

When a function is created, it gains a reference to the environment where it was made. This is the enclosing environment and is used for lexical scoping. You can determine the enclosing environment of a function by calling environment() with a function as its first argument:

In diagrams, I’ll depict functions as rounded rectangles. The enclosing environment of a function is given by a small black circle:

Binding environments

The previous diagram is too simple because functions don’t have names. Instead, the name of a function is defined by a binding. The binding environments of a function are all the environments which have a binding to it. The following diagram better reflects this relationship because the enclosing environment contains a binding from f to the function:

In this case the enclosing and binding environments are the same. They will be different if you assign a function into a different environment:

The enclosing environment belongs to the function, and never changes, even if the function is moved to a different environment. The enclosing environment determines how the function finds values; the binding environments determine how we find the function.
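The difference between enclosing and binding environments can be illustrated with a short sketch. The names f, g, y, and e are assumptions for the example.

```r
y <- 1
f <- function(x) x + y

# f is enclosed by the environment in which it was created.
enclosing <- environment(f)

# Bind the same function to a name in another environment.
e <- new.env()
e$g <- f

# e is now a binding environment for the function, but the
# enclosing environment is unchanged.
unchanged <- identical(environment(e$g), enclosing)

# y is still found through the enclosing environment, not through e.
result <- e$g(1)
```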

The distinction between the binding environment and the enclosing environment is important for package namespaces. Package namespaces keep packages independent. For example, if package A uses the base mean() function, what happens if package B creates its own mean() function? Namespaces ensure that package A continues to use the base mean() function, and that package A is not affected by package B (unless explicitly asked for).

Namespaces are implemented using environments, taking advantage of the fact that functions don’t have to live in their enclosing environments. For example, take the base function sd() . Its binding and enclosing environments are different:

The definition of sd() uses var() , but if we make our own version of var() it doesn’t affect sd() :
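A sketch of that experiment, assuming the stats package is attached (as it is in a default R session):

```r
binding_env <- as.environment("package:stats")  # where we find sd()
enclosing_env <- environment(sd)                # where sd() finds var()

x <- 1:10
before <- sd(x)

# A local var() that would give a very different answer if sd() used it.
var <- function(x, na.rm = TRUE) 100

after <- sd(x)  # unchanged: sd() still uses the var() in its namespace
```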

This works because every package has two environments associated with it: the package environment and the namespace environment. The package environment contains every publicly accessible function, and is placed on the search path. The namespace environment contains all functions (including internal functions), and its parent environment is a special imports environment that contains bindings to all the functions that the package needs. Every exported function in a package is bound into the package environment, but enclosed by the namespace environment. This complicated relationship is illustrated by the following diagram:

When we type var into the console, it’s found first in the global environment. When sd() looks for var() it finds it first in its namespace environment so never looks in the globalenv() .

Execution environments

What will the following function return the first time it’s run? What about the second?

This function returns the same value every time it is called because of the fresh start principle, described in a fresh start . Each time a function is called, a new environment is created to host execution. The parent of the execution environment is the enclosing environment of the function. Once the function has completed, this environment is thrown away.
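One function with this behaviour might look like the sketch below. It is an illustration of the fresh start principle under assumed names (g, a), not necessarily the function the question refers to.

```r
g <- function() {
  # Look only in this call's execution environment.
  if (!exists("a", envir = environment(), inherits = FALSE)) {
    a <- 1      # always taken: each call starts with no local a
  } else {
    a <- a + 1  # never reached
  }
  a
}

first <- g()
second <- g()
```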

Let’s depict that graphically with a simpler function. I draw execution environments around the function they belong to with a dotted border.

When you create a function inside another function, the enclosing environment of the child function is the execution environment of the parent, and the execution environment is no longer ephemeral. The following example illustrates that idea with a function factory, plus() . We use that factory to create a function called plus_one() . The enclosing environment of plus_one() is the execution environment of plus() where x is bound to the value 1.
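A minimal sketch of such a factory, using the names plus() and plus_one() from the text (the argument names x and y are assumptions):

```r
plus <- function(x) {
  # The returned function is enclosed by this execution environment.
  function(y) x + y
}

plus_one <- plus(1)
three <- plus_one(2)

# The execution environment of plus() survives because plus_one()
# holds a reference to it.
captured_x <- environment(plus_one)$x
```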

You’ll learn more about function factories in functional programming .

Calling environments

Look at the following code. What do you expect i() to return when the code is run?

The top-level x (bound to 20) is a red herring: using the regular scoping rules, i() looks first where it is defined and finds that the value associated with x is 10. However, it’s still meaningful to ask what value x is associated with in the environment where i() is called: x is 10 in the environment where i() is defined, but it is 20 in the environment where i() is called.

We can access this environment using the unfortunately named parent.frame() . This function returns the environment where the function was called. We can also use this function to look up the value of names in that environment:
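The following sketch contrasts the two lookups. The names make_reporter and report are hypothetical.

```r
make_reporter <- function() {
  x <- 10
  function() {
    list(
      # Regular scoping: starts in the execution environment and
      # continues to the enclosing one.
      defined = get("x", envir = environment()),
      # Calling environment: wherever this function was called from.
      called = get("x", envir = parent.frame())
    )
  }
}

report <- make_reporter()
x <- 20
res <- report()
```

Here res$defined reflects the x visible where the inner function was created, and res$called reflects the x visible where report() was called.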

In more complicated scenarios, there’s not just one parent call, but a sequence of calls which lead all the way back to the initiating function, called from the top-level. The following code generates a call stack three levels deep. The open-ended arrows represent the calling environment of each execution environment.

Note that each execution environment has two parents: a calling environment and an enclosing environment. R’s regular scoping rules only use the enclosing parent; parent.frame() allows you to access the calling parent.

Looking up variables in the calling environment rather than in the enclosing environment is called dynamic scoping. Few languages implement dynamic scoping (Emacs Lisp is a notable exception). This is because dynamic scoping makes it much harder to reason about how a function operates: not only do you need to know how it was defined, you also need to know in what context it was called. Dynamic scoping is primarily useful for developing functions that aid interactive data analysis. It is one of the topics discussed in non-standard evaluation.

List the four environments associated with a function. What does each one do? Why is the distinction between enclosing and binding environments particularly important?

Draw a diagram that shows the enclosing environments of this function:

Expand your previous diagram to show function bindings.

Expand it again to show the execution and calling environments.

Write an enhanced version of str() that provides more information about functions. Show where the function was found and what environment it was defined in.

Binding names to values

Assignment is the act of binding (or rebinding) a name to a value in an environment. It is the counterpart to scoping, the set of rules that determines how to find the value associated with a name. Compared to most languages, R has extremely flexible tools for binding names to values. In fact, you can not only bind values to names, but you can also bind expressions (promises) or even functions, so that every time you access the value associated with a name, you get something different!

You’ve probably used regular assignment in R thousands of times. Regular assignment creates a binding between a name and an object in the current environment. Names usually consist of letters, digits, . and _ , and can’t begin with _ . If you try to use a name that doesn’t follow these rules, you get an error:

Reserved words (like TRUE , NULL , if , and function ) follow the rules but are reserved by R for other purposes:

A complete list of reserved words can be found in ?Reserved .

It’s possible to override the usual rules and use a name with any sequence of characters by surrounding the name with backticks:

You can also create non-syntactic bindings using single and double quotes instead of backticks, but I don’t recommend it. The ability to use strings on the left hand side of the assignment arrow is a historical artefact, used before R supported backticks.

The regular assignment arrow, <- , always creates a variable in the current environment. The deep assignment arrow, <<- , never creates a variable in the current environment, but instead modifies an existing variable found by walking up the parent environments.

If <<- doesn’t find an existing variable, it will create one in the global environment. This is usually undesirable, because global variables introduce non-obvious dependencies between functions. <<- is most often used in conjunction with a closure, as described in Closures .
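A common closure pattern that uses <<- is a counter. This is a small sketch; make_counter and count are illustrative names.

```r
make_counter <- function() {
  count <- 0            # regular assignment: created in this execution environment
  function() {
    count <<- count + 1 # deep assignment: modifies count in the enclosing environment
    count
  }
}

counter <- make_counter()
c1 <- counter()
c2 <- counter()
```

Because count lives in the enclosing environment of the returned function, it persists between calls without becoming a global variable.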

There are two other special types of binding, delayed and active:

Rather than assigning the result of an expression immediately, a delayed binding creates and stores a promise to evaluate the expression when needed. We can create delayed bindings with the special assignment operator %<d-% , provided by the pryr package.

%<d-% is a wrapper around the base delayedAssign() function, which you may need to use directly if you need more control. Delayed bindings are used to implement autoload() , which makes R behave as if the package data is in memory, even though it’s only loaded from disk when you ask for it.

Active bindings are not bound to a constant object. Instead, they’re re-computed every time they’re accessed:

%<a-% is a wrapper for the base function makeActiveBinding() . You may want to use this function directly if you want more control. Active bindings are used to implement reference class fields.
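Both kinds of binding can be sketched with the base functions directly, without pryr. The names e, slow, tick, evaluations, and ticks are assumptions for the example.

```r
e <- new.env()

# Delayed binding: the expression is stored as a promise and only
# evaluated the first time the name is accessed.
evaluations <- 0
delayedAssign("slow", {
  evaluations <- evaluations + 1
  "computed"
}, eval.env = environment(), assign.env = e)

before <- evaluations  # nothing has been evaluated yet
v1 <- e$slow           # forces the promise
v2 <- e$slow           # reuses the stored value
after <- evaluations

# Active binding: the function is re-run on every access.
ticks <- 0
makeActiveBinding("tick", function() {
  ticks <<- ticks + 1
  ticks
}, e)

t1 <- e$tick
t2 <- e$tick
```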

What does this function do? How does it differ from <<- and why might you prefer it?

Create a version of assign() that will only bind new names, never re-bind old names. Some programming languages only do this, and are known as single assignment languages .

Write an assignment function that can do active, delayed, and locked bindings. What might you call it? What arguments should it take? Can you guess which sort of assignment it should do based on the input?

Explicit environments

As well as powering scoping, environments are also useful data structures in their own right because they have reference semantics . Unlike most objects in R, when you modify an environment, it does not make a copy. For example, look at this modify() function.

If you apply it to a list, the original list is not changed because modifying a list actually creates and modifies a copy.

However, if you apply it to an environment, the original environment is modified:
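A sketch of this comparison, assuming a modify() that sets an element named a (the exact body is an assumption):

```r
modify <- function(x) {
  x$a <- 2
  invisible()
}

# A list is copied on modification, so the caller's list is untouched.
x_l <- list()
x_l$a <- 1
modify(x_l)
list_value <- x_l$a

# An environment is modified in place, so the caller sees the change.
x_e <- new.env()
x_e$a <- 1
modify(x_e)
env_value <- x_e$a
```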

Just as you can use a list to pass data between functions, you can also use an environment. When creating your own environment, note that you should set its parent environment to be the empty environment. This ensures you don’t accidentally inherit objects from somewhere else:

Environments are data structures useful for solving three common problems:

  • Avoiding copies of large data.
  • Managing state within a package.
  • Efficiently looking up values from names.

These are described in turn below.

Avoiding copies

Since environments have reference semantics, you’ll never accidentally create a copy. This makes an environment a useful vessel for large objects. It’s a common technique for Bioconductor packages, which often have to manage large genomic objects. Changes in R 3.1.0 have made this use substantially less important because modifying a list no longer makes a deep copy. Previously, modifying a single element of a list would cause every element to be copied, an expensive operation if some elements are large. Now, modifying a list efficiently reuses existing vectors, saving much time.

Package state

Explicit environments are useful in packages because they allow you to maintain state across function calls. Normally, objects in a package are locked, so you can’t modify them directly. Instead, you can do something like this:

Returning the old value from setter functions is a good pattern because it makes it easier to reset the previous value in conjunction with on.exit() (see more in on exit ).
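The pattern might be sketched as follows. The names my_env, get_a, set_a, and with_a are hypothetical; in a package, my_env would be created at the top level of the package code.

```r
my_env <- new.env(parent = emptyenv())
my_env$a <- 1

get_a <- function() {
  my_env$a
}

set_a <- function(value) {
  old <- my_env$a
  my_env$a <- value
  invisible(old)  # return the previous value so callers can restore it
}

# Temporarily change the state, restoring it when the function exits.
with_a <- function(value) {
  old <- set_a(value)
  on.exit(set_a(old))
  get_a()
}

during <- with_a(10)
afterwards <- get_a()
```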

As a hashmap

A hashmap is a data structure that takes constant, O(1), time to find an object based on its name. Environments provide this behaviour by default, so can be used to simulate a hashmap. See the CRAN package hash for a complete development of this idea.
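A small sketch of an environment used as a lookup table, using base R only. The keys and values are made up.

```r
# parent = emptyenv() keeps lookups from falling through to other environments.
h <- new.env(hash = TRUE, parent = emptyenv())

h[["apple"]] <- 3
assign("pear", 5, envir = h)

has_apple <- exists("apple", envir = h, inherits = FALSE)
n_pear <- get("pear", envir = h)

# get0() returns a default instead of signalling an error for a missing key.
n_kiwi <- get0("kiwi", envir = h, inherits = FALSE, ifnotfound = 0)
```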

Quiz answers

There are four ways: every object in an environment must have a name; order doesn’t matter; environments have parents; environments have reference semantics.

The parent of the global environment is the last package that you loaded. The only environment that doesn’t have a parent is the empty environment.

The enclosing environment of a function is the environment where it was created. It determines where a function looks for variables.

Use parent.frame() .

<- always creates a binding in the current environment; <<- rebinds an existing name in a parent of the current environment.

© Hadley Wickham.

7 Environments

7.1 Introduction

The environment is the data structure that powers scoping. This chapter dives deep into environments, describing their structure in depth, and using them to improve your understanding of the four scoping rules described in Section 6.4 . Understanding environments is not necessary for day-to-day use of R. But they are important to understand because they power many important R features like lexical scoping, namespaces, and R6 classes, and interact with evaluation to give you powerful tools for making domain specific languages, like dplyr and ggplot2.

If you can answer the following questions correctly, you already know the most important topics in this chapter. You can find the answers at the end of the chapter in Section 7.7 .

List at least three ways that an environment differs from a list.

What is the parent of the global environment? What is the only environment that doesn’t have a parent?

What is the enclosing environment of a function? Why is it important?

How do you determine the environment from which a function was called?

How are <- and <<- different?

Section 7.2 introduces you to the basic properties of an environment and shows you how to create your own.

Section 7.3 provides a function template for computing with environments, illustrating the idea with a useful function.

Section 7.4 describes environments used for special purposes: for packages, within functions, for namespaces, and for function execution.

Section 7.5 explains the last important environment: the caller environment. This requires you to learn about the call stack, which describes how a function was called. You’ll have seen the call stack if you’ve ever called traceback() to aid debugging.

Section 7.6 briefly discusses three places where environments are useful data structures for solving other problems.

Prerequisites

This chapter will use rlang functions for working with environments, because they allow us to focus on the essence of environments, rather than the incidental details.

The env_ functions in rlang are designed to work with the pipe: all take an environment as the first argument, and many also return an environment. I won’t use the pipe in this chapter in the interest of keeping the code as simple as possible, but you should consider it for your own code.

7.2 Environment basics

Generally, an environment is similar to a named list, with four important exceptions:

Every name must be unique.

The names in an environment are not ordered.

An environment has a parent.

Environments are not copied when modified.

Let’s explore these ideas with code and pictures.

7.2.1 Basics

To create an environment, use rlang::env() . It works like list() , taking a set of name-value pairs:

Use new.env() to create a new environment. Ignore the hash and size parameters; they are not needed. You cannot simultaneously create and define values; use $<- , as shown below.

The job of an environment is to associate, or bind , a set of names to a set of values. You can think of an environment as a bag of names, with no implied order (i.e. it doesn’t make sense to ask which is the first element in an environment). For that reason, we’ll draw the environment as so:

As discussed in Section 2.5.2 , environments have reference semantics: unlike most R objects, when you modify them, you modify them in place, and don’t create a copy. One important implication is that environments can contain themselves.
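A short base R sketch of an environment that contains itself. It uses new.env() and $<- rather than rlang, and the names e1, a, and d are illustrative.

```r
e1 <- new.env()
e1$a <- FALSE
e1$d <- e1  # a binding that points back at the environment itself

# Following d any number of times leads back to the same environment,
# because no copy is ever made.
loops_back <- identical(e1$d$d$d, e1)
still_false <- e1$d$a
```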

Printing an environment just displays its memory address, which is not terribly useful:

Instead, we’ll use env_print() which gives us a little more information:

You can use env_names() to get a character vector giving the current bindings.

In R 3.2.0 and greater, use names() to list the bindings in an environment. If your code needs to work with R 3.1.0 or earlier, use ls() , but note that you’ll need to set all.names = TRUE to show all bindings.

7.2.2 Important environments

We’ll talk in detail about special environments in 7.4 , but for now we need to mention two. The current environment, or current_env() is the environment in which code is currently executing. When you’re experimenting interactively, that’s usually the global environment, or global_env() . The global environment is sometimes called your “workspace”, as it’s where all interactive (i.e. outside of a function) computation takes place.

To compare environments, you need to use identical() and not == . This is because == is a vectorised operator, and environments are not vectors.

Access the global environment with globalenv() and the current environment with environment() . The global environment is printed as R_GlobalEnv and .GlobalEnv .

7.2.3 Parents

Every environment has a parent , another environment. In diagrams, the parent is shown as a small pale blue circle and arrow that points to another environment. The parent is what’s used to implement lexical scoping: if a name is not found in an environment, then R will look in its parent (and so on). You can set the parent environment by supplying an unnamed argument to env() . If you don’t supply it, it defaults to the current environment. In the code below, e2a is the parent of e2b .

To save space, I typically won’t draw all the ancestors; just remember whenever you see a pale blue circle, there’s a parent environment somewhere.

You can find the parent of an environment with env_parent() :

Only one environment doesn’t have a parent: the empty environment. I draw the empty environment with a hollow parent environment, and where space allows I’ll label it with R_EmptyEnv , the name R uses.

The ancestors of every environment eventually terminate with the empty environment. You can see all ancestors with env_parents() :

By default, env_parents() stops when it gets to the global environment. This is useful because the ancestors of the global environment include every attached package, which you can see if you override the default behaviour as below. We’ll come back to these environments in Section 7.4.1 .

Use parent.env() to find the parent of an environment. No base function returns all ancestors.
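Since base R has no single function for this, a small helper can walk the chain with parent.env(). The name ancestors() is hypothetical, and this is only a sketch.

```r
ancestors <- function(env) {
  out <- list()
  # Keep stepping to the parent until we reach the empty environment,
  # which is included as the final element.
  while (!identical(env, emptyenv())) {
    env <- parent.env(env)
    out[[length(out) + 1]] <- env
  }
  out
}

grandparent <- new.env(parent = emptyenv())
parent <- new.env(parent = grandparent)
child <- new.env(parent = parent)

chain <- ancestors(child)
global_chain <- ancestors(globalenv())
```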

7.2.4 Super assignment, <<-

The ancestors of an environment have an important relationship to <<- . Regular assignment, <- , always creates a variable in the current environment. Super assignment, <<- , never creates a variable in the current environment, but instead modifies an existing variable found in a parent environment.

If <<- doesn’t find an existing variable, it will create one in the global environment. This is usually undesirable, because global variables introduce non-obvious dependencies between functions. <<- is most often used in conjunction with a function factory, as described in Section 10.2.4 .

7.2.5 Getting and setting

You can get and set elements of an environment with $ and [[ in the same way as a list:

But you can’t use [[ with numeric indices, and you can’t use [ :
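A brief sketch of both failures, using try() so that the errors are captured rather than stopping the script. The names e and x are illustrative.

```r
e <- new.env()
e$x <- 1

# Bindings have no order, so there is no "first" element to return.
by_position <- try(e[[1]], silent = TRUE)

# Single-bracket subsetting is not defined for environments.
by_bracket <- try(e["x"], silent = TRUE)

# Subsetting by name with [[ works as it does for a list.
by_name <- e[["x"]]
```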

$ and [[ will return NULL if the binding doesn’t exist. Use env_get() if you want an error:

If you want to use a default value if the binding doesn’t exist, you can use the default argument.

There are two other ways to add bindings to an environment:

env_poke() takes a name (as a string) and a value:

env_bind() allows you to bind multiple values:

You can determine if an environment has a binding with env_has() :

Unlike lists, setting an element to NULL does not remove it, because sometimes you want a name that refers to NULL . Instead, use env_unbind() :

Unbinding a name doesn’t delete the object. That’s the job of the garbage collector, which automatically removes objects with no names binding to them. This process is described in more detail in Section 2.6 .

See get(), assign(), exists(), and rm(). These are designed for interactive use with the current environment, so working with other environments is a little clunky. Also beware the inherits argument: it defaults to TRUE, meaning that the base equivalents will inspect the supplied environment and all its ancestors.

7.2.6 Advanced bindings

There are two more exotic variants of env_bind() :

env_bind_lazy() creates delayed bindings , which are evaluated the first time they are accessed. Behind the scenes, delayed bindings create promises, so behave in the same way as function arguments.

The primary use of delayed bindings is in autoload() , which allows R packages to provide datasets that behave like they are loaded in memory, even though they’re only loaded from disk when needed.

env_bind_active() creates active bindings which are re-computed every time they’re accessed:

Active bindings are used to implement R6’s active fields, which you’ll learn about in Section 14.3.2 .

In base R, see ?delayedAssign() and ?makeActiveBinding().
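To make the difference between the two kinds of binding concrete, here is a small sketch using the base functions. The counters stored in state are an assumption added only so that we can observe when evaluation happens.

```r
e <- new.env()
state <- new.env()        # holds counters so evaluation can be observed
state$lazy_runs <- 0
state$ticks <- 0

# Delayed binding: the expression runs once, on first access.
delayedAssign("lazy", {
  state$lazy_runs <- state$lazy_runs + 1
  "computed"
}, assign.env = e)

runs_before <- state$lazy_runs   # nothing has been evaluated yet
first <- e$lazy
second <- e$lazy
runs_after <- state$lazy_runs    # evaluated on the first access only

# Active binding: the function runs on every access.
makeActiveBinding("tick", function() {
  state$ticks <- state$ticks + 1
  state$ticks
}, env = e)

t1 <- e$tick
t2 <- e$tick
```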

7.2.7 Exercises

List three ways in which an environment differs from a list.

Create an environment as illustrated by this picture.


Create a pair of environments as illustrated by this picture.


Explain why e[[1]] and e[c("a", "b")] don’t make sense when e is an environment.

Create a version of env_poke() that will only bind new names, never re-bind old names. Some programming languages only do this, and are known as single assignment languages .

What does this function do? How does it differ from <<- and why might you prefer it?

7.3 Recursing over environments

If you want to operate on every ancestor of an environment, it’s often convenient to write a recursive function. This section shows you how, applying your new knowledge of environments to write where(), a function that, given a name, finds the environment in which that name is defined, using R’s regular scoping rules.

The definition of where() is straightforward. It has two arguments: the name to look for (as a string), and the environment in which to start the search. (We’ll learn why caller_env() is a good default in Section 7.5 .)

There are three cases:

The base case: we’ve reached the empty environment and haven’t found the binding. We can’t go any further, so we throw an error.

The successful case: the name exists in this environment, so we return the environment.

The recursive case: the name was not found in this environment, so try the parent.

These three cases are illustrated with these three examples:
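One way to write where() along these lines is sketched below in base R, with parent.frame() playing the role of caller_env(). The name yyy_not_defined_anywhere is a made-up name chosen so that the base case is reached.

```r
where <- function(name, env = parent.frame()) {
  if (identical(env, emptyenv())) {
    # Base case: we ran out of environments.
    stop("Can't find ", name, call. = FALSE)
  } else if (exists(name, envir = env, inherits = FALSE)) {
    # Success case: the name is bound here.
    env
  } else {
    # Recursive case: try the parent.
    where(name, parent.env(env))
  }
}

# Base case: the name is not defined anywhere.
res_missing <- tryCatch(
  where("yyy_not_defined_anywhere"),
  error = function(err) conditionMessage(err)
)

# Success case: x is defined in the environment we call from.
x <- 5
res_here <- where("x")

# Recursive case: mean is found further up, in the base package.
res_mean <- where("mean")
```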

It might help to see a picture. Imagine you have two environments, as in the following code and diagram:


where("a", e4b) will find a in e4b .

where("b", e4b) doesn’t find b in e4b , so it looks in its parent, e4a , and finds it there.

where("c", e4b) looks in e4b , then e4a , then hits the empty environment and throws an error.

It’s natural to work with environments recursively, so where() provides a useful template. Removing the specifics of where() shows the structure more clearly:

Iteration versus recursion

It’s possible to use a loop instead of recursion. I think it’s harder to understand than the recursive version, but I include it because you might find it easier to see what’s happening if you haven’t written many recursive functions.
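A loop-based sketch might look like the following. The function name where_loop is hypothetical, and the two environments are built to resemble the e4a and e4b example above (the exact values bound in them are assumptions).

```r
where_loop <- function(name, env = parent.frame()) {
  while (!identical(env, emptyenv())) {
    if (exists(name, envir = env, inherits = FALSE)) {
      return(env)            # success case
    }
    env <- parent.env(env)   # move on to the parent
  }
  stop("Can't find ", name, call. = FALSE)   # base case
}

# e4a has the empty environment as its parent; e4b is a child of e4a.
e4a <- new.env(parent = emptyenv())
e4a$a <- 1
e4a$b <- 2
e4b <- new.env(parent = e4a)
e4b$x <- 10
e4b$a <- 11

found_a <- where_loop("a", e4b)   # found directly in e4b
found_b <- where_loop("b", e4b)   # found in the parent, e4a
found_c <- tryCatch(where_loop("c", e4b), error = function(err) "error")
```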

7.3.1 Exercises

Modify where() to return all environments that contain a binding for name . Carefully think through what type of object the function will need to return.

Write a function called fget() that finds only function objects. It should have two arguments, name and env , and should obey the regular scoping rules for functions: if there’s an object with a matching name that’s not a function, look in the parent. For an added challenge, also add an inherits argument which controls whether the function recurses up the parents or only looks in one environment.

7.4 Special environments

Most environments are not created by you (e.g. with env() ) but are instead created by R. In this section, you’ll learn about the most important environments, starting with the package environments. You’ll then learn about the function environment bound to the function when it is created, and the (usually) ephemeral execution environment created every time the function is called. Finally, you’ll see how the package and function environments interact to support namespaces, which ensure that a package always behaves the same way, regardless of what other packages the user has loaded.

7.4.1 Package environments and the search path

Each package attached by library() or require() becomes one of the parents of the global environment. The immediate parent of the global environment is the last package you attached, the parent of that package is the second to last package you attached, …


If you follow all the parents back, you see the order in which every package has been attached. This is known as the search path because all objects in these environments can be found from the top-level interactive workspace. You can see the names of these environments with base::search() , or the environments themselves with rlang::search_envs() :

The last two environments on the search path are always the same:

The Autoloads environment uses delayed bindings to save memory by only loading package objects (like big datasets) when needed.

The base environment, package:base or sometimes just base , is the environment of the base package. It is special because it has to be able to bootstrap the loading of all other packages. You can access it directly with base_env() .
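You can check this structure yourself with base R alone. The sketch below compares search() with a manual walk up the parents of the global environment; the variable names path and walked are illustrative.

```r
path <- search()

# Walk from the global environment up to (but not including) the empty
# environment, collecting the name of each environment on the way.
env <- globalenv()
walked <- character()
while (!identical(env, emptyenv())) {
  walked <- c(walked, environmentName(env))
  env <- parent.env(env)
}

last_two <- tail(path, 2)   # the Autoloads and base environments
```

The two vectors have the same length because they describe the same chain of environments, although search() and environmentName() label the first and last entries differently.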

Note that when you attach another package with library() , the parent environment of the global environment changes:


7.4.2 The function environment

A function binds the current environment when it is created. This is called the function environment , and is used for lexical scoping. Across computer languages, functions that capture (or enclose) their environments are called closures , which is why this term is often used interchangeably with function in R’s documentation.

You can get the function environment with fn_env() :

Use environment(f) to access the environment of function f .

In diagrams, I’ll draw a function as a rectangle with a rounded end that binds an environment.


In this case, f() binds the environment that binds the name f to the function. But that’s not always the case: in the following example g is bound in a new environment e , but g() binds the global environment. The distinction between binding and being bound by is subtle but important; the difference is how we find g versus how g finds its variables.
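The distinction can be checked directly, as in this sketch. Here the variable `here` records the environment the code is run in (the global environment when run at the top level); the function bodies are placeholders.

```r
here <- environment()          # the environment this code runs in

f <- function(x) x + 1
env_of_f <- environment(f)     # f binds the environment where it was created

e <- new.env()
e$g <- function() 1            # the name g is bound in e ...
env_of_g <- environment(e$g)   # ... but g itself binds `here`, not e
```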


7.4.3 Namespaces

In the diagram above, you saw that the parent environment of a package varies based on what other packages have been loaded. This seems worrying: doesn’t that mean that the package will find different functions if packages are loaded in a different order? The goal of namespaces is to make sure that this does not happen, and that every package works the same way regardless of what packages are attached by the user.

For example, take sd() :

sd() is defined in terms of var() , so you might worry that the result of sd() would be affected by any function called var() either in the global environment, or in one of the other attached packages. R avoids this problem by taking advantage of the function versus binding environment described above. Every function in a package is associated with a pair of environments: the package environment, which you learned about earlier, and the namespace environment.

The package environment is the external interface to the package. It’s how you, the R user, find a function in an attached package or with :: . Its parent is determined by search path, i.e. the order in which packages have been attached.

The namespace environment is the internal interface to the package. The package environment controls how we find the function; the namespace controls how the function finds its variables.

Every binding in the package environment is also found in the namespace environment; this ensures every function can use every other function in the package. But some bindings only occur in the namespace environment. These are known as internal or non-exported objects, which make it possible to hide internal implementation details from the user.


Every namespace environment has the same set of ancestors:

Each namespace has an imports environment that contains bindings to all the functions used by the package. The imports environment is controlled by the package developer with the NAMESPACE file.

Explicitly importing every base function would be tiresome, so the parent of the imports environment is the base namespace . The base namespace contains the same bindings as the base environment, but it has a different parent.

The parent of the base namespace is the global environment. This means that if a binding isn’t defined in the imports environment the package will look for it in the usual way. This is usually a bad idea (because it makes code depend on other loaded packages), so R CMD check automatically warns about such code. It is needed primarily for historical reasons, particularly due to how S3 method dispatch works.


Putting all these diagrams together we get:


So when sd() looks for the value of var it always finds it in a sequence of environments determined by the package developer, but not by the package user. This ensures that package code always works the same way regardless of what packages have been attached by the user.
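As a rough check of this chain, the sketch below inspects the environment of stats::sd and defines a deliberately broken var() in the calling environment to show that sd() ignores it. The user-defined var() is an assumption added for the demonstration.

```r
ns <- environment(stats::sd)   # the namespace environment of stats

# A user-defined var() does not affect sd(), because sd() looks up var
# starting from the stats namespace, not from where sd() is called.
var <- function(...) stop("user-defined var() was called")
result <- stats::sd(c(1, 2, 3, 4))
rm(var)                        # clean up the masking definition

# The ancestors of the namespace: imports, base namespace, global.
imports <- parent.env(ns)
base_ns <- parent.env(imports)
after_base <- parent.env(base_ns)
```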

There’s no direct link between the package and namespace environments; the link is defined by the function environments.

7.4.4 Execution environments

The last important topic we need to cover is the execution environment. What will the following function return the first time it’s run? What about the second?

Think about it for a moment before you read on.

This function returns the same value every time because of the fresh start principle, described in Section 6.4.3 . Each time a function is called, a new environment is created to host execution. This is called the execution environment, and its parent is the function environment. Let’s illustrate that process with a simpler function. Figure 7.1 illustrates the graphical conventions: I draw execution environments with an indirect parent; the parent environment is found via the function environment.


Figure 7.1: The execution environment of a simple function call. Note that the parent of the execution environment is the function environment.
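A function of the kind asked about at the start of this section could look like the sketch below, written in base R. The function g and its body are an illustrative reconstruction, not necessarily the exact function the question refers to.

```r
g <- function(x) {
  # exists(..., inherits = FALSE) checks only this call's execution
  # environment, which starts out fresh on every call.
  if (!exists("a", inherits = FALSE)) {
    message("Defining a")
    a <- 1
  } else {
    a <- a + 1
  }
  a
}

first <- g(10)
second <- g(10)
```

Both calls take the first branch, because the binding for a created by the first call lives in an execution environment that the second call never sees.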

An execution environment is usually ephemeral; once the function has completed, the environment will be garbage collected. There are several ways to make it stay around for longer. The first is to explicitly return it:
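For instance, in base R a function can return its own execution environment with environment(). The function name h2 and its body are assumptions for this sketch.

```r
h2 <- function(x) {
  a <- x * 2
  environment()     # return the execution environment itself
}

e <- h2(x = 10)
kept <- sort(ls(e)) # the argument and the local variable are both kept
```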

Another way to capture it is to return an object with a binding to that environment, like a function. The following example illustrates that idea with a function factory, plus() . We use that factory to create a function called plus_one() .

There’s a lot going on in the diagram because the enclosing environment of plus_one() is the execution environment of plus() .


What happens when we call plus_one() ? Its execution environment will have the captured execution environment of plus() as its parent:
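A minimal version of such a factory, and a way to inspect the captured environment, might look like this. The body of plus() is an assumption consistent with the names used above.

```r
plus <- function(x) {
  function(y) x + y
}

plus_one <- plus(1)

# The enclosing environment of plus_one() is the execution environment
# of the call plus(1); it still holds x.
captured <- environment(plus_one)
value_of_x <- get("x", envir = captured)

result <- plus_one(2)
```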


You’ll learn more about function factories in Section 10.2 .

7.4.5 Exercises

How is search_envs() different from env_parents(global_env()) ?

Draw a diagram that shows the enclosing environments of this function:

Write an enhanced version of str() that provides more information about functions. Show where the function was found and what environment it was defined in.

7.5 Call stacks

There is one last environment we need to explain, the caller environment, accessed with rlang::caller_env() . This provides the environment from which the function was called, and hence varies based on how the function is called, not how the function was created. As we saw above this is a useful default whenever you write a function that takes an environment as an argument.

parent.frame() is equivalent to caller_env() ; just note that it returns an environment, not a frame.
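The following sketch uses parent.frame() to show that the caller environment depends on where a function is called, not where it is created. The function names who_called and outer_fn, and the variable marker, are hypothetical.

```r
who_called <- function() parent.frame()

outer_fn <- function() {
  marker <- "set in outer_fn"
  who_called()      # called from inside outer_fn's execution environment
}

caller <- outer_fn()
found_marker <- get("marker", envir = caller, inherits = FALSE)
```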

To fully understand the caller environment we need to discuss two related concepts: the call stack , which is made up of frames . Executing a function creates two types of context. You’ve learned about one already: the execution environment is a child of the function environment, which is determined by where the function was created. There’s another type of context created by where the function was called: this is called the call stack.

7.5.1 Simple call stacks

Let’s illustrate this with a simple sequence of calls: f() calls g() calls h() .

The way you most commonly see a call stack in R is by looking at the traceback() after an error has occurred:

Instead of stop() + traceback() to understand the call stack, we’re going to use lobstr::cst() to print out the call stack tree:

This shows us that cst() was called from h() , which was called from g() , which was called from f() . Note that the order is the opposite from traceback() . As the call stacks get more complicated, I think it’s easier to understand the sequence of calls if you start from the beginning, rather than the end (i.e.  f() calls g() ; rather than g() was called by f() ).
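If lobstr is not available, base R’s sys.calls() gives a similar (flat) view, listed from the outermost call to the innermost. In this sketch the function bodies are assumptions; only the last three entries are kept in case the code is itself being run from inside other calls.

```r
f <- function(x) g(x = 2)
g <- function(x) h(x = 3)
h <- function(x) {
  calls <- sys.calls()   # all calls currently on the stack
  as.list(calls)
}

calls <- f(x = 1)

# Extract the function name from each of the last three calls.
last_three <- tail(calls, 3)
fn_names <- vapply(last_three, function(cl) deparse(cl[[1]]), character(1))
```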

7.5.2 Lazy evaluation

The call stack above is simple: while you get a hint that there’s some tree-like structure involved, everything happens on a single branch. This is typical of a call stack when all arguments are eagerly evaluated.

Let’s create a more complicated example that involves some lazy evaluation. We’ll create a sequence of functions, a() , b() , c() , that pass along an argument x .

x is lazily evaluated so this tree gets two branches. In the first branch a() calls b() , then b() calls c() . The second branch starts when c() evaluates its argument x . This argument is evaluated in a new branch because the environment in which it is evaluated is the global environment, not the environment of c() .
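The branching can also be observed with base R’s sys.calls() and sys.parents(). In this sketch the three functions are renamed pass_a(), pass_b(), and pass_c() (to avoid masking base c()), and snapshot() is a hypothetical helper that records the stack at the moment the lazy argument is evaluated.

```r
pass_a <- function(x) pass_b(x)
pass_b <- function(x) pass_c(x)
pass_c <- function(x) x            # the argument is finally forced here

snapshot <- function() {
  calls <- sys.calls()
  list(
    fns = vapply(as.list(calls), function(cl) deparse(cl[[1]])[1], character(1)),
    parents = sys.parents()
  )
}

# snapshot() is written at the call site, but only runs inside pass_c().
info <- pass_a(snapshot())
n <- length(info$fns)
```

Although snapshot() appears on the stack after pass_c(), its parent frame is the same as that of pass_a(), which is what produces the second branch of the tree.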

7.5.3 Frames

Each element of the call stack is a frame, also known as an evaluation context. The frame is an extremely important internal data structure, and R code can only access a small part of the data structure because tampering with it will break R. A frame has three key components:

An expression (labelled with expr ) giving the function call. This is what traceback() prints out.

An environment (labelled with env ), which is typically the execution environment of a function. There are two main exceptions: the environment of the global frame is the global environment, and calling eval() also generates frames, where the environment can be anything.

A parent, the previous call in the call stack (shown by a grey arrow).

Figure 7.2 illustrates the stack for the call to f(x = 1) shown in Section 7.5.1 .


Figure 7.2: The graphical depiction of a simple call stack

(To focus on the calling environments, I have omitted the bindings in the global environment from f , g , and h to the respective function objects.)

The frame also holds exit handlers created with on.exit() , restarts and handlers for the condition system, and which context to return() to when a function completes. These are important internal details that are not accessible with R code.

7.5.4 Dynamic scope

Looking up variables in the calling stack rather than in the enclosing environment is called dynamic scoping. Few languages implement dynamic scoping (Emacs Lisp is a notable exception). This is because dynamic scoping makes it much harder to reason about how a function operates: not only do you need to know how it was defined, you also need to know the context in which it was called. Dynamic scoping is primarily useful for developing functions that aid interactive data analysis, and is one of the topics discussed in Chapter 20.

7.5.5 Exercises

  • Write a function that lists all the variables defined in the environment in which it was called. It should return the same results as ls() .

7.6 As data structures

As well as powering scoping, environments are also useful data structures in their own right because they have reference semantics. There are three common problems that they can help solve:

Avoiding copies of large data . Since environments have reference semantics, you’ll never accidentally create a copy. But bare environments are painful to work with, so instead I recommend using R6 objects, which are built on top of environments. Learn more in Chapter 14 .

Managing state within a package . Explicit environments are useful in packages because they allow you to maintain state across function calls. Normally, objects in a package are locked, so you can’t modify them directly. Instead, you can do something like this:

Returning the old value from setter functions is a good pattern because it makes it easier to reset the previous value in conjunction with on.exit() (Section 6.7.4 ).
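A sketch of this pattern is shown below. The names my_env, get_a(), and set_a() are illustrative, and with_a() is a hypothetical helper added to show how the returned old value combines with on.exit().

```r
my_env <- new.env(parent = emptyenv())
my_env$a <- 1

get_a <- function() {
  my_env$a
}

set_a <- function(value) {
  old <- my_env$a
  my_env$a <- value
  invisible(old)            # return the previous value, invisibly
}

# Hypothetical helper: temporarily change a, then restore it on exit.
with_a <- function(value, code) {
  old <- set_a(value)
  on.exit(set_a(old), add = TRUE)
  code                      # the argument is evaluated here, after set_a()
}

inside <- with_a(5, get_a())
outside <- get_a()
```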

As a hashmap. A hashmap is a data structure that takes constant, O(1), time to find an object based on its name. Environments provide this behaviour by default, so can be used to simulate a hashmap. See the hash package for a complete development of this idea.
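As a small illustration, the sketch below uses an environment to count words by name. The data and variable names are made up for the example.

```r
counts <- new.env(hash = TRUE, parent = emptyenv())
words <- c("a", "b", "a", "c", "a")

for (w in words) {
  current <- counts[[w]]    # NULL if the key has not been seen yet
  counts[[w]] <- if (is.null(current)) 1 else current + 1
}

keys <- sort(ls(counts))
```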

7.7 Quiz answers

There are four ways: every object in an environment must have a name; order doesn’t matter; environments have parents; environments have reference semantics.

The parent of the global environment is the last package that you attached. The only environment that doesn’t have a parent is the empty environment.

The enclosing environment of a function is the environment where it was created. It determines where a function looks for variables.

Use caller_env() or parent.frame() .

<- always creates a binding in the current environment; <<- rebinds an existing name in a parent of the current environment.

assign: Assign a Value to a Name

Description.

Assign a value to a name in an environment.

x: a variable name, given as a character string. No coercion is done, and the first element of a character vector of length greater than one will be used, with a warning.

value: a value to be assigned to x.

pos: where to do the assignment. By default, assigns into the current environment. See ‘Details’ for other possibilities.

envir: the environment to use. See ‘Details’.

inherits: should the enclosing frames of the environment be inspected?

immediate: an ignored compatibility feature.

This function is invoked for its side effect, which is assigning value to the variable x . If no envir is specified, then the assignment takes place in the currently active environment.

If inherits is TRUE , enclosing environments of the supplied environment are searched until the variable x is encountered. The value is then assigned in the environment in which the variable is encountered (provided that the binding is not locked: see lockBinding : if it is, an error is signaled). If the symbol is not encountered then assignment takes place in the user's workspace (the global environment).

If inherits is FALSE , assignment takes place in the initial frame of envir , unless an existing binding is locked or there is no existing binding and the environment is locked (when an error is signaled).

There are no restrictions on the name given as x : it can be a non-syntactic name (see make.names ).

The pos argument can specify the environment in which to assign the object in any of several ways: as -1 (the default), as a positive integer (the position in the search list); as the character string name of an element in the search list; or as an environment (including using sys.frame to access the currently active function calls). The envir argument is an alternative way to specify an environment, but is primarily for back compatibility.

assign does not dispatch assignment methods, so it cannot be used to set elements of vectors, names, attributes, etc.

Note that assignment to an attached list or data frame changes the attached copy and not the original object: see attach and with .

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See also: <-, get (the inverse of assign()), exists, environment.


Hands-On Programming with R

8 Environments

Your deck is now ready for a game of blackjack (or hearts or war), but are your shuffle and deal functions up to snuff? Definitely not. For example, deal deals the same card over and over again:

And the shuffle function doesn’t actually shuffle deck (it returns a copy of deck that has been shuffled). In short, both of these functions use deck , but neither manipulates deck —and we would like them to.

To fix these functions, you will need to learn how R stores, looks up, and manipulates objects like deck . R does all of these things with the help of an environment system.

8.1 Environments

Consider for a moment how your computer stores files. Every file is saved in a folder, and each folder is saved in another folder, which forms a hierarchical file system. If your computer wants to open up a file, it must first look up the file in this file system.

You can see your file system by opening a finder window. For example, Figure 8.1 shows part of the file system on my computer. I have tons of folders. Inside one of them is a subfolder named Documents , inside of that subfolder is a sub-subfolder named ggsubplot , inside of that folder is a folder named inst , inside of that is a folder named doc , and inside of that is a file named manual.pdf .


Figure 8.1: Your computer arranges files into a hierarchy of folders and subfolders. To look at a file, you need to find where it is saved in the file system.

R uses a similar system to save R objects. Each object is saved inside of an environment, a list-like object that resembles a folder on your computer. Each environment is connected to a parent environment , a higher-level environment, which creates a hierarchy of environments.

You can see R’s environment system with the parenvs function in the pryr package (note parenvs came in the pryr package when this book was first published). parenvs(all = TRUE) will return a list of the environments that your R session is using. The actual output will vary from session to session depending on which packages you have loaded. Here’s the output from my current session:

It takes some imagination to interpret this output, so let’s visualize the environments as a system of folders, Figure 8.2 . You can think of the environment tree like this. The lowest-level environment is named R_GlobalEnv and is saved inside an environment named package:pryr , which is saved inside the environment named 0x7fff3321c388 , and so on, until you get to the final, highest-level environment, R_EmptyEnv . R_EmptyEnv is the only R environment that does not have a parent environment.


Figure 8.2: R stores R objects in an environment tree that resembles your computer’s folder system.

Remember that this example is just a metaphor. R’s environments exist in your RAM, not in your file system. Also, R environments aren’t technically saved inside one another. Each environment is connected to a parent environment, which makes it easy to search up R’s environment tree. But this connection is one-way: there’s no way to look at one environment and tell what its “children” are. So you cannot search down R’s environment tree. In other ways, though, R’s environment system works similarly to a file system.

8.2 Working with Environments

R comes with some helper functions that you can use to explore your environment tree. First, you can refer to any of the environments in your tree with as.environment . as.environment takes an environment name (as a character string) and returns the corresponding environment:

Three environments in your tree also come with their own accessor functions. These are the global environment ( R_GlobalEnv ), the base environment ( base ), and the empty environment ( R_EmptyEnv ). You can refer to them with:

Next, you can look up an environment’s parent with parent.env :

Notice that the empty environment is the only R environment without a parent:
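The helper functions described so far can be tried out as in this sketch. The variable names are illustrative, and the error from the last call is captured with tryCatch() so that the script keeps running.

```r
# Refer to an environment by its name on the search path.
by_name <- as.environment("package:base")

# Accessor functions for the three special environments.
ge <- globalenv()
be <- baseenv()
ee <- emptyenv()

# Look up parents.
parent_of_global <- parent.env(ge)
parent_of_base <- parent.env(be)

# The empty environment has no parent, so this is an error.
no_parent <- tryCatch(parent.env(ee), error = function(err) "error")
```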

You can view the objects saved in an environment with ls or ls.str . ls will return just the object names, but ls.str will display a little about each object’s structure:

The empty environment is—not surprisingly—empty; the base environment has too many objects to list here; and the global environment has some familiar faces. It is where R has saved all of the objects that you’ve created so far.

You can use R’s $ syntax to access an object in a specific environment. For example, you can access deck from the global environment:

And you can use the assign function to save an object into a particular environment. First give assign the name of the new object (as a character string). Then give assign the value of the new object, and finally the environment to save the object in:

Notice that assign works similarly to <- . If an object already exists with the given name in the given environment, assign will overwrite it without asking for permission. This makes assign useful for updating objects but creates the potential for heartache.
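Here is a short sketch of ls, ls.str, $, and assign working together. To avoid touching your real workspace it uses a scratch environment named store rather than the global environment, and deck_size is a made-up object.

```r
store <- new.env()

# Save an object into a specific environment.
assign("deck_size", 52, envir = store)

ls(store)       # just the names
ls.str(store)   # names plus a little structure

# Access the object with $.
before <- store$deck_size

# assign() overwrites an existing object without asking.
assign("deck_size", 51, envir = store)
after <- store$deck_size
```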

Now that you can explore R’s environment tree, let’s examine how R uses it. R works closely with the environment tree to look up objects, store objects, and evaluate functions. How R does each of these tasks will depend on the current active environment.

8.2.1 The Active Environment

At any moment in time, R is working closely with a single environment. R will store new objects in this environment (if you create any), and R will use this environment as a starting point to look up existing objects (if you call any). I’ll call this special environment the active environment . The active environment is usually the global environment, but this may change when you run a function.

You can use environment to see the current active environment:

The global environment plays a special role in R. It is the active environment for every command that you run at the command line. As a result, any object that you create at the command line will be saved in the global environment. You can think of the global environment as your user workspace.

When you call an object at the command line, R will look for it first in the global environment. But what if the object is not there? In that case, R will follow a series of rules to look up the object.

8.3 Scoping Rules

R follows a special set of rules to look up objects. These rules are known as R’s scoping rules, and you’ve already met a couple of them:

  • R looks for objects in the current active environment.
  • When you work at the command line, the active environment is the global environment. Hence, R looks up objects that you call at the command line in the global environment.

Here is a third rule that explains how R finds objects that are not in the active environment:

  • When R does not find an object in an environment, R looks in the environment’s parent environment, then the parent of the parent, and so on, until R finds the object or reaches the empty environment.

So, if you call an object at the command line, R will look for it in the global environment. If R can’t find it there, R will look in the parent of the global environment, and then the parent of the parent, and so on, working its way up the environment tree until it finds the object, as in Figure 8.3 . If R cannot find the object in any environment, it will return an error that says the object is not found.


Figure 8.3: R will search for an object by name in the active environment, here the global environment. If R does not find the object there, it will search in the active environment’s parent, and then the parent’s parent, and so on until R finds the object or runs out of environments.

8.4 Assignment

When you assign a value to an object, R saves the value in the active environment under the object’s name. If an object with the same name already exists in the active environment, R will overwrite it.

For example, an object named new exists in the global environment:

You can save a new object named new to the global environment with this command. R will overwrite the old object as a result:

This arrangement creates a quandary for R whenever R runs a function. Many functions save temporary objects that help them do their jobs. For example, the roll function from Project 1: Weighted Dice saved an object named die and an object named dice :

R must save these temporary objects in the active environment; but if R does that, it may overwrite existing objects. Function authors cannot guess ahead of time which names may already exist in your active environment. How does R avoid this risk? Every time R runs a function, it creates a new active environment to evaluate the function in.
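You can see this protection at work with a sketch like the one below. The body of roll is an assumed reconstruction of a two-dice function, and the outer die object is a made-up stand-in for something already in your workspace.

```r
roll <- function() {
  die <- 1:6                                      # temporary object
  dice <- sample(die, size = 2, replace = TRUE)   # temporary object
  sum(dice)
}

# An object that happens to share a name with one of roll's temporaries.
die <- "my own object"

total <- roll()

# The outer die is untouched, and no object named dice has leaked out.
die
leaked <- exists("dice", inherits = FALSE)
```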

8.5 Evaluation

R creates a new environment each time it evaluates a function. R will use the new environment as the active environment while it runs the function, and then R will return to the environment that you called the function from, bringing the function’s result with it. Let’s call these new environments runtime environments because R creates them at runtime to evaluate functions.

We’ll use the following function to explore R’s runtime environments. We want to know what the environments look like: what are their parent environments, and what objects do they contain? show_env is designed to tell us:

show_env is itself a function, so when we call show_env() , R will create a runtime environment to evaluate the function in. The results of show_env will tell us the name of the runtime environment, its parent, and which objects the runtime environment contains:

The results reveal that R created a new environment named 0x7ff711d12e28 to run show_env() in. The environment had no objects in it, and its parent was the global environment . So for purposes of running show_env , R’s environment tree looked like Figure 8.4 .

Let’s run show_env again:

This time show_env ran in a new environment, 0x7ff715f49808 . R creates a new environment each time you run a function. The 0x7ff715f49808 environment looks exactly the same as 0x7ff711d12e28 . It is empty and has the same global environment as its parent.
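A function along the lines of show_env could be sketched as follows. This version reports object names with ls rather than ls.str, and the element names of the returned list are assumptions.

```r
show_env <- function() {
  list(
    ran.in = environment(),               # the runtime environment
    parent = parent.env(environment()),   # its parent
    objects = ls(environment())           # names of objects it contains
  )
}

run1 <- show_env()
run2 <- show_env()

# Each run gets its own environment, but both share the same parent.
same_env <- identical(run1$ran.in, run2$ran.in)
same_parent <- identical(run1$parent, run2$parent)
```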


Figure 8.4: R creates a new environment to run show_env in. The environment is a child of the global environment.

Now let’s consider which environment R will use as the parent of the runtime environment.

R will connect a function’s runtime environment to the environment that the function was first created in . This environment plays an important role in the function’s life—because all of the function’s runtime environments will use it as a parent. Let’s call this environment the origin environment . You can look up a function’s origin environment by running environment on the function:

The origin environment of show_env is the global environment because we created show_env at the command line, but the origin environment does not need to be the global environment. For example, the environment of parenvs is the pryr package:

In other words, the parent of a runtime environment will not always be the global environment; it will be whichever environment the function was first created in.

Finally, let’s look at the objects contained in a runtime environment. At the moment, show_env ’s runtime environments do not contain any objects, but that is easy to fix. Just have show_env create some objects in its body of code. R will store any objects created by show_env in its runtime environment. Why? Because the runtime environment will be the active environment when those objects are created:

This time when we run show_env , R stores a , b , and c in the runtime environment:

This is how R ensures that a function does not overwrite anything that it shouldn’t. Any objects created by the function are stored in a safe, out-of-the-way runtime environment.

R will also put a second type of object in a runtime environment. If a function has arguments, R will copy over each argument to the runtime environment. The argument will appear as an object that has the name of the argument but the value of whatever input the user provided for the argument. This ensures that a function will be able to find and use each of its arguments:
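A small sketch of this behavior, using a hypothetical variant of show_env (here called show_env_arg) that takes one argument with a default value:

```r
foo <- "take me to your runtime"

# Hypothetical variant of show_env with a single argument, x
show_env_arg <- function(x = foo) {
  list(
    ran.in  = environment(),
    parent  = parent.env(environment()),
    objects = ls(environment())
  )
}

result <- show_env_arg()

# The argument appears in the runtime environment under its own name
result$objects

# Its value is whatever was supplied for the argument (here, the default)
get("x", envir = result$ran.in)
```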

Let’s put this all together to see how R evaluates a function. Before you call a function, R is working in an active environment; let’s call this the calling environment . It is the environment R calls the function from.

Then you call the function. R responds by setting up a new runtime environment. This environment will be a child of the function’s origin environment. R will copy each of the function’s arguments into the runtime environment and then make the runtime environment the new active environment.

Next, R runs the code in the body of the function. If the code creates any objects, R stores them in the active, that is, runtime environment. If the code calls any objects, R uses its scoping rules to look them up. R will search the runtime environment, then the parent of the runtime environment (which will be the origin environment), then the parent of the origin environment, and so on. Notice that the calling environment might not be on the search path. Usually, a function will only call its arguments, which R can find in the active runtime environment.

Finally, R finishes running the function. It switches the active environment back to the calling environment. Now R executes any other commands in the line of code that called the function. So if you save the result of the function to an object with <- , the new object will be stored in the calling environment.

To recap, R stores its objects in an environment system. At any moment of time, R is working closely with a single active environment. It stores new objects in this environment, and it uses the environment as a starting point when it searches for existing objects. R’s active environment is usually the global environment, but R will adjust the active environment to do things like run functions in a safe manner.

How can you use this knowledge to fix the deal and shuffle functions?

First, let’s start with a warm-up question. Suppose I redefine deal at the command line like this:

Notice that deal no longer takes an argument, and it calls the deck object, which lives in the global environment.

When deal calls deck , R will need to look up the deck object. R’s scoping rules will lead it to the version of deck in the global environment, as in Figure 8.5 . deal works as expected as a result:

Figure 8.5: R finds deck by looking in the parent of deal’s runtime environment. The parent is the global environment, deal’s origin environment. Here, R finds the copy of deck.

Now let’s fix the deal function to remove the cards it has dealt from deck . Recall that deal returns the top card of deck but does not remove the card from the deck. As a result, deal always returns the same card:

You know enough R syntax to remove the top card of deck . The following code will save a pristine copy of deck and then remove the top card:

Now let’s add the code to deal . Here deal saves (and then returns) the top card of deck . In between, it removes the card from deck …or does it?

This code won’t work because R will be in a runtime environment when it executes deck <- deck[-1, ] . Instead of overwriting the global copy of deck with deck[-1, ] , deal will just create a slightly altered copy of deck in its runtime environment, as in Figure 8.6 .

Figure 8.6: The deal function looks up deck in the global environment but saves deck[-1, ] in the runtime environment as a new object named deck.
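One way to make deal overwrite the global copy of deck is to call assign() with envir = globalenv() . The sketch below assumes a hypothetical four-card deck in place of the full deck:

```r
# Hypothetical four-card deck standing in for the full deck
deck <- data.frame(
  face  = c("king", "queen", "jack", "ten"),
  suit  = "spades",
  value = c(13, 12, 11, 10)
)

deal <- function() {
  card <- deck[1, ]
  # Overwrite the global copy of deck rather than creating a local one
  assign("deck", deck[-1, ], envir = globalenv())
  card
}

first_card  <- deal()
second_card <- deal()
nrow(deck)   # two cards have left the global deck
```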

Now deal will finally clean up the global copy of deck , and we can deal cards just as we would in real life:

Let’s turn our attention to the shuffle function:

shuffle(deck) doesn’t shuffle the deck object; it returns a shuffled copy of the deck object:

This behavior is now undesirable in two ways. First, shuffle fails to shuffle deck . Second, shuffle returns a copy of deck , which may be missing the cards that have been dealt away. It would be better if shuffle returned the dealt cards to the deck and then shuffled. This is what happens when you shuffle a deck of cards in real life.

Since DECK lives in the global environment, shuffle ’s environment of origin, shuffle will be able to find DECK at runtime. R will search for DECK first in shuffle ’s runtime environment, and then in shuffle ’s origin environment—the global environment—which is where DECK is stored.

The second line of shuffle will create a reordered copy of DECK and save it as deck in the global environment. This will overwrite the previous, nonshuffled version of deck .
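A sketch of a shuffle function that matches this description, assuming DECK is a pristine copy of a hypothetical four-card deck and that one card has already been dealt from deck :

```r
# Hypothetical pristine deck and a working copy with one card dealt away
DECK <- data.frame(
  face  = c("king", "queen", "jack", "ten"),
  suit  = "spades",
  value = c(13, 12, 11, 10)
)
deck <- DECK[-1, ]

shuffle <- function() {
  # First line: a random ordering of all rows of DECK
  random <- sample(seq_len(nrow(DECK)))
  # Second line: save the reordered copy as deck in the global environment
  assign("deck", DECK[random, ], envir = globalenv())
}

shuffle()
nrow(deck)   # the dealt card is back in the deck
```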

8.6 Closures

Our system finally works. For example, you can shuffle the cards and then deal a hand of blackjack:

But the system requires deck and DECK to exist in the global environment. Lots of things happen in this environment, and it is possible that deck may get modified or erased by accident.

It would be better if we could store deck in a safe, out-of-the-way place, like one of those safe, out-of-the-way environments that R creates to run functions in. In fact, storing deck in a runtime environment is not such a bad idea.

You could create a function that takes deck as an argument and saves a copy of deck as DECK . The function could also save its own copies of deal and shuffle :

When you run setup , R will create a runtime environment to store these objects in. The environment will look like Figure 8.7 .

Now all of these things are safely out of the way in a child of the global environment. That makes them safe but hard to use. Let’s ask setup to return DEAL and SHUFFLE so we can use them. The best way to do this is to return the functions as a list:

Figure 8.7: Running setup will store deck and DECK in an out-of-the-way place, and create a DEAL and SHUFFLE function. Each of these objects will be stored in an environment whose parent is the global environment.

Then you can save each of the elements of the list to a dedicated object in the global environment:

Now you can run deal and shuffle just as before. Each object contains the same code as the original deal and shuffle :

However, the functions now have one important difference. Their origin environment is no longer the global environment (although deal and shuffle are currently saved there). Their origin environment is the runtime environment that R made when you ran setup . That’s where R created DEAL and SHUFFLE , the functions copied into the new deal and shuffle , as shown in:

Why does this matter? Because now when you run deal or shuffle , R will evaluate the functions in a runtime environment that uses 0x7ff7169c3390 as its parent. DECK and deck will be in this parent environment, which means that deal and shuffle will be able to find them at runtime. DECK and deck will be in the functions’ search path but still out of the way in every other respect, as shown in Figure 8.8 .

Figure 8.8: Now deal and shuffle will be run in an environment that has the protected deck and DECK in its search path.

This arrangement is called a closure . setup ’s runtime environment “encloses” the deal and shuffle functions. Both deal and shuffle can work closely with the objects contained in the enclosing environment, but almost nothing else can. The enclosing environment is not on the search path for any other R function or environment.

You may have noticed that deal and shuffle still update the deck object in the global environment. Don’t worry, we’re about to change that. We want deal and shuffle to work exclusively with the objects in the parent (enclosing) environment of their runtime environments. Instead of having each function reference the global environment to update deck , you can have them reference their parent environment at runtime, as shown in Figure 8.9 :

Figure 8.9: When you change your code, deal and shuffle will go from updating the global environment (left) to updating their parent environment (right).

We finally have a self-contained card game. You can delete (or modify) the global copy of deck as much as you want and still play cards. deal and shuffle will use the pristine, protected copy of deck :
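The whole arrangement can be sketched as follows. The function bodies are an assumption based on the description above, and small_deck is a hypothetical four-card deck used only to keep the example short:

```r
setup <- function(deck) {
  DECK <- deck   # pristine copy, kept in setup's runtime environment

  DEAL <- function() {
    card <- deck[1, ]
    # Update deck in the enclosing environment, not the global environment
    assign("deck", deck[-1, ], envir = parent.env(environment()))
    card
  }

  SHUFFLE <- function() {
    random <- sample(seq_len(nrow(DECK)))
    assign("deck", DECK[random, ], envir = parent.env(environment()))
  }

  list(deal = DEAL, shuffle = SHUFFLE)
}

small_deck <- data.frame(
  face  = c("king", "queen", "jack", "ten"),
  suit  = "spades",
  value = c(13, 12, 11, 10)
)

cards   <- setup(small_deck)
deal    <- cards$deal
shuffle <- cards$shuffle

rm(small_deck)   # the global copy is no longer needed

shuffle()
hand <- deal()
```

After this runs, the protected deck lives in environment(deal) and has one card fewer, while the global environment holds no deck at all.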

8.7 Summary

R saves its objects in an environment system that resembles your computer’s file system. If you understand this system, you can predict how R will look up objects. If you call an object at the command line, R will look for the object in the global environment and then the parents of the global environment, working its way up the environment tree one environment at a time.

R will use a slightly different search path when you call an object from inside of a function. When you run a function, R creates a new environment to execute commands in. This environment will be a child of the environment where the function was originally defined. This may be the global environment, but it also may not be. You can use this behavior to create closures, which are functions linked to objects in protected environments.

As you become familiar with R’s environment system, you can use it to produce elegant results, like we did here. However, the real value of understanding the environment system comes from knowing how R functions do their job. You can use this knowledge to figure out what is going wrong when a function does not perform as expected.

8.8 Project 2 Wrap-up

You now have full control over the data sets and values that you load into R. You can store data as R objects, you can retrieve and manipulate data values at will, and you can even predict how R will store and look up your objects in your computer’s memory.

You may not realize it yet, but your expertise makes you a powerful, computer-augmented data user. You can use R to save and work with larger data sets than you could otherwise handle. So far we’ve only worked with deck , a small data set; but you can use the same techniques to work with any data set that fits in your computer’s memory.

However, storing data is not the only logistical task that you will face as a data scientist. You will often want to do tasks with your data that are so complex or repetitive that they are difficult to do without a computer. Some of the things can be done with functions that already exist in R and its packages, but others cannot. You will be the most versatile as a data scientist if you can write your own programs for computers to follow. R can help you do this. When you are ready, Project 3: Slot Machine will teach you the most useful skills for writing programs in R.


R Environment and Scope

In this tutorial, you will learn everything about environment and scope in R programming with the help of examples.

In order to write functions in a proper way and avoid unusual errors, we need to know the concept of environment and scope in R.

  • R Programming Environment

An environment can be thought of as a collection of objects (functions, variables etc.). An environment is created when we first fire up the R interpreter. Any variable we define is now in this environment.

The top level environment available to us at the R command prompt is the global environment called R_GlobalEnv . The global environment can be referred to as .GlobalEnv in R code as well.

We can use the ls() function to show what variables and functions are defined in the current environment. Moreover, we can use the environment() function to get the current environment.

  • Example of environment() function
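A minimal sketch of such an example, assuming two variables a and b and a one-argument function f defined at the command prompt:

```r
a <- 2
b <- 5
f <- function(x) x <- 0

ls()            # names defined in the current environment
environment()   # the current environment
.GlobalEnv      # the global environment itself
```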

In the above example, we can see that a , b and f are in the R_GlobalEnv environment.

Notice that x (the argument of the function) is not in this global environment. When we call a function, a new environment is created for that call.

Here, each call to f() creates a new environment whose enclosing (parent) environment is the global environment.

Actually an environment has a frame, which has all the objects defined, and a pointer to the enclosing (parent) environment.

Hence, x is in the frame of the new environment created by the function f . This environment will also have a pointer to R_GlobalEnv .

  • Example: Cascading of environments

In the above example, we have defined two nested functions: f and g .

The g() function is defined inside the f() function. When the f() function is called, it creates a local variable g and defines the g() function within its own environment.

The g() function prints "Inside g" , displays its own environment using environment() , and lists the objects in its environment using ls() .

After that, the f() function prints "Inside f" , displays its own environment using environment() , and lists the objects in its environment using ls() .
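The description above corresponds to code along these lines. The argument names f_x and g_x are assumptions made for illustration:

```r
f <- function(f_x) {
  g <- function(g_x) {
    print("Inside g")
    print(environment())   # g's own environment
    print(ls())            # objects in g's environment
  }
  g(5)
  print("Inside f")
  print(environment())     # f's own environment
  print(ls())              # objects in f's environment
}

f(6)
```

The two environments printed are different, and ls() lists only g_x inside g but both f_x and g inside f .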

  • R Programming Scope

In R programming, scope refers to the accessibility or visibility of objects (variables, functions, etc.) within different parts of your code.

In R, there are two main types of variables: global variables and local variables.

Let's consider an example:
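A sketch consistent with the discussion that follows, with a defined at the top level, b inside outer_func() , and c inside inner_func() . The sum at the end is an addition for illustration, to show which variables are visible from the innermost function:

```r
a <- 10                       # defined at the top level

outer_func <- function() {
  b <- 20                     # local to outer_func()
  inner_func <- function() {
    c <- 30                   # local to inner_func()
    a + b + c                 # a and b are found in enclosing environments
  }
  inner_func()
}

outer_func()                  # returns 60
```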

Global Variables

Global variables are those variables which exist throughout the execution of a program. They can be changed and accessed from any part of the program.

However, global variables also depend upon the perspective of a function.

For example, in the above example, from the perspective of inner_func() , both a and b are global variables .

However, from the perspective of outer_func() , b is a local variable and only a is a global variable. The variable c is completely invisible to outer_func() .

Local Variables

On the other hand, local variables are those variables which exist only within a certain part of a program like a function, and are released when the function call ends.

In the above program the variable c is called a local variable .

If we assign a value to a variable within the function inner_func() , the change will only be local and cannot be accessed outside the function.

This is also the same even if names of both global variables and local variables match.

For example, if we have a function as below.
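A sketch of such a function, reconstructed from the explanation that follows it:

```r
outer_func <- function() {
  a <- 20
  inner_func <- function() {
    a <- 30
    print(a)     # prints 30, the value local to inner_func()
  }
  inner_func()
  print(a)       # prints 20, the value local to outer_func()
}

a <- 10
outer_func()
print(a)         # prints 10, the global value is unchanged
```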

Here, the outer_func() function is defined, and within it, a local variable a is assigned the value 20 .

Inside outer_func() , there is an inner_func() function defined. The inner_func() function also has its own local variable a , which is assigned the value 30 .

When inner_func() is called within outer_func() , it prints the value of its local variable a (30) . Then, outer_func() continues executing and prints the value of its local variable a (20) .

Outside the functions, a global variable a is assigned the value 10 . This code then prints the value of the global variable a (10) .

  • Accessing global variables

Global variables can be read, but when we try to assign to one inside a function, a new local variable is created instead.

To make assignments to global variables, the superassignment operator, <<- , is used.

When this operator is used within a function, it searches for the variable in the parent environment frame; if the variable is not found there, it keeps searching the next level up until it reaches the global environment.

If the variable is still not found, it is created and assigned at the global level.

When the statement a <<- 30 is encountered within inner_func() , it looks for the variable a in outer_func() environment.

When the search fails, it searches in R_GlobalEnv .

Since, a is not defined in this global environment as well, it is created and assigned there which is now referenced and printed from within inner_func() as well as outer_func() .
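A sketch of the code this walkthrough describes, assuming no variable a exists before outer_func() is called:

```r
outer_func <- function() {
  inner_func <- function() {
    a <<- 30     # no a in outer_func() or the global environment: created globally
    print(a)
  }
  inner_func()
  print(a)       # finds the global a created by inner_func()
}

outer_func()
print(a)
```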

Advanced R Solutions

5 Environments

5.1 Environment basics

  • environments have reference semantics
  • environments have parents
  • environments are not ordered
  • elements of environments need to be (uniquely) named

Q : If you don’t supply an explicit environment, where do ls() and rm() look? Where does <- make bindings? A : ls() and rm() look in their calling environments, which they find via as.environment(-1) . From the book:

Assignment is the act of binding (or rebinding) a name to a value in an environment.

From ?`<-` :

The operators <- and = assign into the environment in which they are evaluated. The operator <- can be used anywhere, whereas the operator = is only allowed at the top level (e.g., in the complete expression typed at the command prompt) or as one of the subexpressions in a braced list of expressions.

Q : Using parent.env() and a loop (or a recursive function), verify that the ancestors of globalenv() include baseenv() and emptyenv() . Use the same basic idea to implement your own version of search() . A : We can print the ancestors for example by using a recursive function:
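A minimal sketch of such a recursive function. The name ancestors() is an assumption, and the function returns the names instead of printing them so the result is easy to inspect:

```r
ancestors <- function(env = globalenv()) {
  # Base case: the empty environment has no parent
  if (identical(env, emptyenv())) {
    return(environmentName(env))
  }
  # Recursive case: this environment's name, then the names of its ancestors
  c(environmentName(env), ancestors(parent.env(env)))
}

ancestors()
```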

To implement a new version of search() we use a while statement:
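One possible version is sketched below under the name search2() , which is an assumption. Two names are adjusted by hand because search() labels the global and base environments differently from environmentName() :

```r
search2 <- function(env = globalenv()) {
  envs <- character()
  while (!identical(env, emptyenv())) {
    ename <- environmentName(env)
    # Match the labels that search() uses for these two environments
    if (ename == "R_GlobalEnv") ename <- ".GlobalEnv"
    if (ename == "base") ename <- "package:base"
    envs <- c(envs, ename)
    env <- parent.env(env)
  }
  envs
}

search2()
```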

5.2 Recursing over environments

Q : Modify where() to find all environments that contain a binding for name . A : We look at the source code of the original pryr::where() :

Since where() stops searching when a match appears, we copy the recursive call in the else block to the block of the matching (“success”) case, so that our new function where2 will look for a binding within the complete search path. We also need to pay attention to other details. We have to take care to save the bindings in an object, while not overriding it in our recursive calls. So we create a list object for that and define a new function within where2() that we call where2.internal . where2.internal() will do the recursive work and whenever it finds a binding it will write it via <<- to the especially created list in its enclosing environment:
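A sketch of this approach is given below. It is a simplified reconstruction, not the original solution: it uses base exists() and skips the environment coercion that pryr::where() performs on env .

```r
where2 <- function(name, env = parent.frame()) {
  stopifnot(is.character(name), length(name) == 1)
  found <- list()   # collects every environment that binds name

  where2.internal <- function(name, env) {
    if (identical(env, emptyenv())) {
      # Base case: nothing left to search
      return(invisible(NULL))
    }
    if (exists(name, envir = env, inherits = FALSE)) {
      # Success case: record the environment, then keep searching
      found[[length(found) + 1]] <<- env
    }
    # Recursive case
    where2.internal(name, parent.env(env))
  }

  where2.internal(name, env)
  found
}

# Usage: two nested environments that both bind x
e1 <- new.env(parent = emptyenv())
e2 <- new.env(parent = e1)
assign("x", 1, envir = e1)
assign("x", 2, envir = e2)
hits <- where2("x", e2)
length(hits)   # 2
```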

Note that where2.internal() still provides the same structure as pryr::where does and you can also divide it in “base case”, “success case” and “recursive case”.

Q : Write your own version of get() using a function written in the style of where() . A : Note that get() provides a bit more arguments than our following version, but it should be easy to build up on that. However, we can change pryr::where to get2() with just changing one line of code (and the function name for the recursive call):

Q : Write a function called fget() that finds only function objects. It should have two arguments, name and env , and should obey the regular scoping rules for functions: if there’s an object with a matching name that’s not a function, look in the parent. For an added challenge, also add an inherits argument which controls whether the function recurses up the parents or only looks in one environment. A : We can build up our function on the implementation of get2() in the last exercise. We only need to add a check via is.function() , change the name (also in the recursive call) and the error message:

Note that this function is almost the same as the implementation of pryr::fget() :

We add an inherits parameter as described in the exercise:

Q : Write your own version of exists(inherits = FALSE) (Hint: use ls() .) Write a recursive version that behaves like exists(inherits = TRUE) . A : We write two versions. exists2() will be the case inherits = FALSE and exists3() inherits = TRUE :
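The two versions might look like the following sketch, which follows the ls() hint and uses all.names = TRUE so that names starting with a dot are also found:

```r
# Like exists(inherits = FALSE): look in env only
exists2 <- function(name, env = parent.frame()) {
  stopifnot(is.character(name), length(name) == 1)
  name %in% ls(env, all.names = TRUE)
}

# Like exists(inherits = TRUE): walk up the parents until the empty environment
exists3 <- function(name, env = parent.frame()) {
  stopifnot(is.character(name), length(name) == 1)
  if (identical(env, emptyenv())) {
    return(FALSE)
  }
  if (name %in% ls(env, all.names = TRUE)) {
    TRUE
  } else {
    exists3(name, parent.env(env))
  }
}

e <- new.env()
assign("y", 1, envir = e)
exists2("y", e)     # TRUE
exists2("sum", e)   # FALSE: sum is not bound in e itself
exists3("sum", e)   # TRUE: sum is found in an ancestor (the base environment)
```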

5.3 Function environments

  • Enclosing: where the function is created
  • Binding: where the function was assigned
  • Execution: a temporary environment which is created when the function is executed
  • Calling: the environment from where the function was called

The difference between binding and enclosing environment is important, because of R’s lexical scoping rules. If R can’t find an object in the current environment while executing a function, it will look for it in the enclosing environment.

Q : Draw a diagram that shows the enclosing environments of this function:

Q : Expand your previous diagram to show function bindings. A :

Q : Expand it again to show the execution and calling environments. A :

Q : Write an enhanced version of str() that provides more information about functions. Show where the function was found and what environment it was defined in. A : Additionally we provide the function type in the sense of pryr::ftype . We use functions from the pryr package, since it provides helpers for all requested features:

Note that we wanted to have non standard evaluation like the original str() function. Since pryr::where() doesn’t support non standard evaluation, we needed to catch the name of the supplied object . Therefore we used expr_text() from the lazyeval package. As a result, fstr(object = packagename::functionname) will result in an error in contrast to str() .

5.4 Binding names to values

Q : What does this function do? How does it differ from <<- and why might you prefer it?
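The function in question is the rebind() function discussed in the answer. A sketch of it, written to match the behavior the answer describes (the exact original code is assumed, not quoted):

```r
rebind <- function(name, value, env = parent.frame()) {
  if (identical(env, emptyenv())) {
    # No binding found anywhere: fail instead of creating a global variable
    stop("Can't find ", name, call. = FALSE)
  } else if (exists(name, envir = env, inherits = FALSE)) {
    # Binding found here: rebind it in place
    assign(name, value, envir = env)
  } else {
    # Keep looking in the parent
    rebind(name, value, parent.env(env))
  }
}

e1 <- new.env()
e2 <- new.env(parent = e1)
assign("v", 1, envir = e1)
rebind("v", 2, e2)
get("v", envir = e1)   # 2: the existing binding in e1 was modified
```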

A : The function does “more or less” the same as <<- . In addition to what <<- offers, it has an env argument, but this is not a big advantage, since assign() also provides this functionality. The main difference is that rebind() only does an assignment when it finds a binding in env or one of its parent environments. Whereas:

If <<- doesn’t find an existing variable, it will create one in the global environment. This is usually undesirable, because global variables introduce non-obvious dependencies between functions.

Q : Create a version of assign() that will only bind new names, never re-bind old names. Some programming languages only do this, and are known as single assignment languages. A : We take the formals from assign() ’s source code and define our new function. If x already exists, we give a message and return NULL (since this is the same as return() ). Otherwise we let the body of the assign() function do the work:

Note that .Internal(assign(x, value, envir, inherits)) , is not inside an else block or any other function. This is important. Otherwise we would change more of assign() than we want (in case of the assignment of a new function, the enclosing environment for that function would differ).

Q : Write an assignment function that can do active, delayed, and locked bindings. What might you call it? What arguments should it take? Can you guess which sort of assignment it should do based on the input? A : The following might not be an optimal solution, but we can at least handle two of the three cases via if statements. The problem already occurred in the last exercise, where we had to do an assignment in an if statement and used a workaround. This workaround only works for one assignment (for logical reasons). We still use the workaround for the “delay case”, but we found a solution for the other two cases. The main aspect of it is to unify the environment where assign() , makeActiveBinding() and delayedAssign() act. We also had to test that cases like this

work with our new function, and that our function creates bindings (and so enclosing environments) in the same places as assign() would, also when used inside functions.

The usage of pryr:::to_env() simplified this process a lot:

We used all these thoughts to create the following function:

At the moment we have no idea for a good default guess routine, so that a specific type of assignment would be done based on the input.

R: assign() inside nested functions

  • Post author: Emil O. W. Kirkegaard
  • Post published: 30. December 2015
  • Post category: Programming

Recently, I wrote a function called copy_names(). It does what you think and a little more: it copies names from one object to another. But it can also attempt to do so even when the sizes of the objects’ dimensions do not match up perfectly. For instance:

Here we create a matrix and make a copy of it. Then we assign dimension names to the first object. Then we inspect both of them. Unsurprisingly, only the first has names (because R uses copy-on-modify semantics ). Then we call the copy function, and afterwards we see that the second gets the names copied. Hooray!

What if there is imperfect matching? The function will first check whether the number of dimensions is the same and if so, it checks each dimension to see if the lengths match in that dimension. If so, the names are copied. If not, nothing is done. For instance:

Here we create two matrices, but not of exactly the same sizes: the first is 3×2 and the second is 3×3. Then we assign dimnames to the first. Then we copy to the second and inspect. We see that only the dimension that matched in length (i.e. the first) had the names copied.

How does it work?

Before I changed it, the code looked like this (including roxygen2 documentation ):

The call that does the trick is the last one, namely the one using assign() . Here we modify an object outside the function’s own environment . How do we know which one to modify? Well, we take one step back (pos = 1). Alternatively, one could have used <<- .

Inside nested functions

However, consider this scenario:

Here we define two functions, one of which calls the other. We also define x outside (in the global environment). Inside func1() we also define x to be another value. However, note the strange result inside func2. When asked to fetch x, which doesn’t exist in that function’s environment, it returns the value from the… global environment (i.e. x=1), not the func1() environment (x=2)! This is odd because func2() was called from func1(), so one would expect it to try getting it from there before trying the global environment. When we then call x in the global environment after the functions finish, we see that x has been changed there, not inside func2() as might be expected. This is a problem because if we call copy_names() inside a function, it is supposed to change the names of the object inside the function, not inside the global environment.
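A sketch that reproduces the behavior described. The values 1, 2 and 3 are assumptions chosen to match the text, and func2() stands in for a function that assigns with pos = 1 :

```r
x <- 1

func2 <- function() {
  print(x)                    # prints 1: x is looked up in the global environment
  assign("x", 3, pos = 1)     # pos = 1 is the first entry of the search list
}

func1 <- function() {
  x <- 2
  func2()
  x                           # still 2: func1()'s own x is untouched
}

func1()
x                             # now 3: the global x was changed instead
```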

Why is this? It is complicated , but as far as I can make out, it is due to the difference between the calling environment (where we call the function from) and the enclosing environment (where it was created, in the case above the global environment). R by default will look up variables in the enclosing environment, not the calling environment. assign() using pos = 1 does not refer to the calling environment either: a positive pos is a position in the search list, and position 1 is the global environment. Hence it changes the value in the global environment, not in the environment of the function that called it, as intended.

The fix is to use the following line instead:

which then assigns the value to the object in the right environment, namely in func1()’s.
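A sketch of the fix, using the same hypothetical func1() and func2() with assumed values. parent.frame() returns the environment the function was called from:

```r
x <- 1

func2 <- function() {
  # Assign in the calling environment, not the global environment
  assign("x", 3, envir = parent.frame())
}

func1 <- function() {
  x <- 2
  func2()
  x          # now 3: func2() changed func1()'s own x
}

func1()      # returns 3
x            # still 1: the outer x is untouched
```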

copy_names() part 2

This also means that copy_names() does not work within functions. For instance:

Above, we define a new function, get_loadings() , that fetches the loadings from a factor analysis object and transforms it into a clean data.frame by a roundabout way.* We see that the object returned did not keep the dimnames despite copy_names() being called. The fix is the same as above: calling assign with envir = parent.frame().

* The reason to use the roundabout way is that the loadings extracted have some odd properties that make them unusable in many functions and they also refuse to be converted to a data.frame. But it turns out that one can just change the class to “matrix” and then they are fine! So one doesn’t actually need copy_names() in this case after all.

Get parent environments

Description

env_parent() returns the parent environment of env if called with n = 1 , the grandparent with n = 2 , etc.

env_tail() searches through the parents and returns the one which has empty_env() as parent.

env_parents() returns the list of all parents, including the empty environment. This list is named using env_name() .

See the section on inheritance in env() 's documentation.

An environment for env_parent() and env_tail() , a list of environments for env_parents() .

assign: Assign a Value to a Name

Description

Assign a value to a name in an environment.

There are no restrictions on the name given as x : it can be a non-syntactic name (see make.names ).

The pos argument can specify the environment in which to assign the object in any of several ways: as -1 (the default), as a positive integer (the position in the search list); as the character string name of an element in the search list; or as an environment (including using sys.frame to access the currently active function calls). The envir argument is an alternative way to specify an environment, but is primarily for back compatibility.

assign does not dispatch assignment methods, so it cannot be used to set elements of vectors, names, attributes, etc.

Note that assignment to an attached list or data frame changes the attached copy and not the original object: see attach and with .

This function is invoked for its side effect, which is assigning value to the variable x . If no envir is specified, then the assignment takes place in the currently active environment.

If inherits is TRUE , enclosing environments of the supplied environment are searched until the variable x is encountered. The value is then assigned in the environment in which the variable is encountered (provided that the binding is not locked: see lockBinding : if it is, an error is signaled). If the symbol is not encountered then assignment takes place in the user's workspace (the global environment).

If inherits is FALSE , assignment takes place in the initial frame of envir , unless an existing binding is locked or there is no existing binding and the environment is locked (when an error is signaled).
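The effect of inherits and of a locked binding can be seen in a short sketch. The environment names parent_env and child_env are hypothetical:

```r
parent_env <- new.env()
child_env  <- new.env(parent = parent_env)
assign("n", 1, envir = parent_env)

# inherits = TRUE: n is first found in parent_env, so it is rebound there
assign("n", 2, envir = child_env, inherits = TRUE)
in_child_after_inherit <- exists("n", envir = child_env, inherits = FALSE)

# inherits = FALSE (the default): bind in child_env itself
assign("n", 3, envir = child_env)

# A locked binding cannot be reassigned: assign() signals an error
lockBinding("n", parent_env)
locked_result <- tryCatch(
  assign("n", 4, envir = parent_env),
  error = function(e) "error"
)

get("n", envir = parent_env)   # 2
get("n", envir = child_env)    # 3
```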

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language . Wadsworth & Brooks/Cole.

<- , get , the inverse of assign() , exists , environment .

COMMENTS

  1. r

    EDIT: I think the term that is frequently used is "calling environment" as opposed to "parent environment". Edited Mar 18, 2017 at 12:35. Justin Thong. Asked Mar 18 ... You can get a calling environment using parent.frame (don't confuse it with parent.env) and assign variables to it using $ or [ ...

  2. Environments · Advanced R.

    Its parent is the empty environment. The emptyenv(), or empty environment, is the ultimate ancestor of all environments, and the only environment without a parent. The environment() is the current environment. search() lists all parents of the global environment. This is called the search path because objects in these environments can be found ...

  3. 7 Environments

    7.1 Introduction. The environment is the data structure that powers scoping. This chapter dives deep into environments, describing their structure in depth, and using them to improve your understanding of the four scoping rules described in Section 6.4 . Understanding environments is not necessary for day-to-day use of R.

  4. Create a new environment

    All R environments (except the empty environment) are defined with a parent environment. ... (by assigning the environment to another symbol with <- or passing the environment as argument to a function), modifying the bindings of one of those references changes all other references as well. See also: env_has(), env_bind().

  5. assign function

    a variable name, given as a character string. No coercion is done, and the first element of a character vector of length greater than one will be used, with a warning. value. a value to be assigned to x. pos. where to do the assignment. By default, assigns into the current environment. See 'Details' for other possibilities.

  6. 8 Environments

    You can use R's $ syntax to access an object in a specific environment. For example, you can access deck from the global environment: head(globalenv()$deck, 3) ## face suit value ## king spades 13 ## queen spades 12 ## jack spades 11. And you can use the assign function to save an object into a particular environment. First give assign the name of the new object (as a character string).

  7. R Environment and Scope (With Examples)

    Output. [1] "a" "b" "f". <environment: R_GlobalEnv>. In the above example, we can see that a, b and f are in the R_GlobalEnv environment. Notice that x (in the argument of the function) is not in this global environment. When we define a function, a new environment is created. Here, the function f() creates a new environment inside the global ...

  8. R: Environments and Variable Scope

    An environment differs from a list in that: (1) Names in an environment are not ordered; (2) Each environment has a parent environment, with the exception of one: empty_env(). 1.2. Global Environment

  9. 5 Environments

    The operators <- and = assign into the environment in which they are evaluated. The operator <- can be used anywhere, whereas the operator = is only allowed at the top level ... Using parent.env() and a loop (or a recursive function), verify that the ancestors of globalenv() include baseenv() and emptyenv().

  10. assign.parent : Assign a variable in the parent environment when

    Assign a variable in the parent environment when <<- doesn't seem to work. In roxygen: Literate Programming in R.

  11. R: assign() inside nested functions

    R by default will look up variables in the enclosing environment, not the calling environment. assign() using pos = 1 apparently does not work with the calling environment but with the enclosing environments, and hence it changes the value in the global environment, not in the environment of the function that called it, as intended.

  12. Get or set the environment of an object

    These functions dispatch internally with methods for functions, formulas and frames. If called with a missing argument, the environment of the current evaluation frame is returned. If you call get_env() with an environment, it acts as the identity function and the environment is simply returned (this helps simplifying code when writing generic functions for environments).

  13. environment: Environment Access

    an arbitrary R object. hash: a logical, if TRUE the environment will use a hash table. parent: an environment to be used as the enclosure of the environment created. env: an environment. size: an integer specifying the initial size for a hashed environment. An internal default value will be used if size is NA or zero. This argument is ignored ...

  14. Get parent environments

    env_parent() returns the parent environment of env if called with n = 1, the grandparent with n = 2, etc. env_tail() searches through the parents and returns the one which has empty_env() as parent. env_parents() returns the list of all parents, including the empty environment. This list is named using env_name(). See the section on inheritance in env()'s documentation.

  15. R: Create a new environment

    Create a new environment Description. These functions create new environments. env() creates a child of the current environment by default and takes a variable number of named objects to populate it. new_environment() creates a child of the empty environment by default and takes a named list of objects to populate it. Usage env(...) new_environment(data = list(), parent = empty_env())

  16. R: Get parent environments

    Get parent environments Description. env_parent() returns the parent environment of env if called with n = 1, the grandparent with n = 2, etc. env_tail() searches through the parents and returns the one which has empty_env() as parent. env_parents() returns the list of all parents, including the empty environment. This list is named using env_name().. See the section on inheritance in env()'s ...

  17. r

    It doesn't actually seem like you are passing the parent argument when you make childEnv. You are relying on R type-matching the argument to parent, which doesn't seem to work here. At the moment (and I am guessing) R is positionally matching your environment to the hash argument and, because it is not logical, silently dropping it. Instead try this:

  18. r

    I have a parent function (fun1) that takes data and a helper function (fun2) that operates on columns from that data. I want to assign the value from that helper function to its matching column in data in fun1. In reality there are lots of little helper functions operating on columns and I want the value changed by fun2 to be what the other helper functions deal with.

  19. assign: Assign a Value to a Name

    a variable name, given as a character string. No coercion is done, and the first element of a character vector of length greater than one will be used, with a warning. value. a value to be assigned to x. pos. where to do the assignment. By default, assigns into the current environment. See 'Details' for other possibilities.