Why You Might Use StructCopy() Instead of Duplicate()


By: Sean Corfield

Over on Will Tomlinson’s blog there’s a piece about using structCopy() to create a copy of a struct and a note from Charlie Griefer cautioning that for Will’s example, he probably needed to use duplicate() instead. After discussing this with Will on IM, I figured it might be instructive to look at how structCopy() differs from duplicate() and why you might use it instead.

First off, let me say that the reason I think this causes confusion for a lot of CFers is that they don’t have a Computer Science background, so they’ve not had the “Memory and Pointers 101” course that makes this stuff a lot clearer. Hopefully, this blog post will help fill in some of the gaps.

Some basics. When you assign something to a variable in CFML, you are really doing two things: you are creating a label (the variable name) and you are allocating some memory to associate the label with the data. In particular, with structs, the struct itself exists in a block of memory (well, lots of connected blocks of memory) and then the variable “points to” the struct data.

Let’s start with the simplest example (these examples require ColdFusion 8.0.1):

writeOutput("Assign (by reference):<br />");
var1 = { a = 1, b = { c = 2 }, d = 3 };
var2 = var1;
var2.a = 4; // affects both
dump(var1=var1,var2=var2);

Assume a dump function like this:

<cffunction name="dump">
<cfdump label="arguments" var="#arguments#" />
</cffunction>

OK, so in the above code, var1 points to a struct that contains two top-level keys (a, d) and a nested struct (b, which points to a struct containing c). When the assignment (var2 = var1;) is executed, var2 is made to point to the same thing that var1 points at. In other words, var1 and var2 are synonyms. When you change the struct data through var2, it modifies the single, shared copy of the struct.

Now, let’s look at a common idiom:

writeOutput("StructNew/StructAppend (equivalent to structCopy):<br />");
var1 = { a = 1, b = { c = 2 }, d = 3 };
var2 = structNew();
structAppend(var2,var1);
var2.a = 4; // does not affect var1
dump(var1=var1,var2=var2);
var2.b.c = 5; // affects both
dump(var1=var1,var2=var2);

In this code, var1 points to the struct data and var2 is set to point to a new, empty struct. The structAppend() call copies the (top-level) elements from the first struct to the second struct. At that point, var1 points to a struct that contains two top-level keys (a, d) and a nested struct (b, which points to a struct containing c); var2 points to a (separate) struct that contains two (new) top-level keys (a, d) and a nested struct (b, which points to the same struct data as the first b under var1). Let’s look at that again: structAppend() copies in a, b and d as keys to the new struct as if it had done:

var2.a = var1.a;
var2.b = var1.b;
var2.d = var1.d;

We can see that var2.b is made to point to the same struct data as var1.b just as the direct assignment of var1 to var2 did in the first example above. When we assign var2.a = 4; we are updating the value associated with the key a in the struct pointed to by var2. Since a was copied into var2, it’s a different key entry to the a in var1. When we assign var2.b.c = 5; we are reaching into the shared struct data that both var1.b and var2.b point to and updating it. That’s why that change appears in both dumps - because there’s only one instance of that struct, pointed to by both of the top-level structs.
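To make the key-by-key copy concrete, here is a minimal hand-written sketch of a top-level copy. The function name shallowCopy is hypothetical (it is not a built-in), and this only illustrates the behavior described above; it is not a claim about how the engine implements structAppend() internally:

```cfml
<!--- Hypothetical helper: copies top-level keys only. --->
<cffunction name="shallowCopy" returntype="struct" output="false">
    <cfargument name="source" type="struct" required="true" />
    <cfset var result = structNew() />
    <cfset var key = "" />
    <cfloop collection="#arguments.source#" item="key">
        <!--- A nested struct is assigned, not cloned, so both structs end up pointing at it. --->
        <cfset result[key] = arguments.source[key] />
    </cfloop>
    <cfreturn result />
</cffunction>
```

Calling var2 = shallowCopy(var1); should then behave like the structNew() / structAppend() pair: var2.a is an independent entry, while var2.b still points to the same nested struct as var1.b.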

Now, what about structCopy()? It does exactly what the structNew() / structAppend() combination does. It creates a new top-level structure and populates it with the keys from the original struct. Any nested structs (or objects) will end up being shared between the original and the “copy”. Here:

writeOutput("StructCopy:<br />");
var1 = { a = 1, b = { c = 2 }, d = 3 };
var2 = structCopy(var1);
var2.a = 4; // does not affect var1
dump(var1=var1,var2=var2);
var2.b.c = 5; // affects both
dump(var1=var1,var2=var2);

If you want a complete, separate copy, you need to use duplicate() which will walk the entire data structure and create a brand new copy of every level within it:

writeOutput("Duplicate:<br />");
var1 = { a = 1, b = { c = 2 }, d = 3 };
var2 = duplicate(var1);
var2.a = 4; // does not affect var1
dump(var1=var1,var2=var2);
var2.b.c = 5; // does not affect var1
dump(var1=var1,var2=var2);

The duplicate() call not only copies the top-level struct (as shown above) but also the nested struct, so that var2.b points to a new struct that is a copy of var1.b. Thus var2.b.c is completely separate from var1.b.c.
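As a rough sketch of what "walking the entire data structure" means, here is a hand-written recursive copy that handles plain nested structs only. The name deepCopyStruct is hypothetical, and the real duplicate() covers far more (arrays of structs, queries, CFCs and so on) than this illustration, which does not recurse into anything except nested structs:

```cfml
<!--- Hypothetical helper: recursive copy of plain nested structs only. --->
<cffunction name="deepCopyStruct" returntype="struct" output="false">
    <cfargument name="source" type="struct" required="true" />
    <cfset var result = structNew() />
    <cfset var key = "" />
    <cfloop collection="#arguments.source#" item="key">
        <cfif isStruct(arguments.source[key]) and not isObject(arguments.source[key])>
            <!--- Nested struct: build a fresh copy of that level too. --->
            <cfset result[key] = deepCopyStruct(arguments.source[key]) />
        <cfelse>
            <!--- Everything else is assigned as-is (this sketch does not clone objects). --->
            <cfset result[key] = arguments.source[key] />
        </cfif>
    </cfloop>
    <cfreturn result />
</cffunction>
```

With var2 = deepCopyStruct(var1); in the example above, var2.b would be a new struct, so assigning var2.b.c = 5; would leave var1.b.c untouched. In real code, just use duplicate().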

As the documentation for structCopy() says “Copies top-level keys, values, and arrays in the structure by value; copies nested structures by reference.” and it goes on to say “To copy a structure entirely by value, use Duplicate.” So why would you ever use structCopy()? You probably don’t need it very often but bear in mind how it works compared to structAppend() and how often you use that function. If you have a struct containing CFCs, you may well not want to duplicate() the CFCs (remember: duplicate() does a full deep copy of CFCs now which won’t be correct if your CFC refers to a singleton, e.g., TransferObject CFCs if you’re using Transfer). If your struct is just a container for data and you don’t need the data itself to be copied (i.e., a new copy made), then structCopy() is what you want. If your struct can contain CFCs, think very carefully about the impact of using duplicate() - again, structCopy() may be what you want.
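Here is a small sketch of the "container" case. To keep it self-contained, a plain struct named sharedService stands in for what would be a CFC instance in a real application; that name, and the original and snapshot variables, are made up for illustration:

```cfml
<cfscript>
// Hypothetical stand-in for a shared singleton; a real application would hold a CFC instance here.
sharedService = { name = "transfer", hits = 0 };

original = { service = sharedService, label = "original" };
snapshot = structCopy(original);

snapshot.label = "copy";       // top-level key: only the snapshot changes
snapshot.extra = "only here";  // new key: only the snapshot gains it
snapshot.service.hits = 1;     // nested struct is shared: visible through both containers

writeOutput(original.label & "<br />");        // still "original"
writeOutput(original.service.hits & "<br />"); // 1, because there is only one service struct
writeOutput(yesNoFormat(structKeyExists(original, "extra"))); // the key was never added to original
</cfscript>
```

The container is copied, so you can add, remove or replace top-level keys freely, but the thing each key points to is deliberately left alone. Using duplicate() here would have produced a second, separate copy of the shared service as well.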

Understanding copy-by-value vs copy-by-reference is very important when dealing with complex data structures.

About The Author

Sean is currently Senior Computer Scientist and Team Lead in the Hosted Services group at Adobe Systems Incorporated. He has worked in the IT industry for nearly twenty-five years, first in database systems and compilers (serving eight years on the ANSI C++ Standards Committee), then in mobile telecoms, and finally in web development. Sean is a staunch advocate of software standards and best practices, and is a well-known and respected speaker on these subjects. Sean has championed and contributed to a number of ColdFusion frameworks, and is a frequent publisher on his blog, http://corfield.org/
